Commit Graph

5613 Commits

Author SHA1 Message Date
Jung-uk Kim
39381048f0 Rename PCB_USER_FPU to PCB_USERFPU not to clash with a macro from fpu.h. 2010-07-29 16:41:21 +00:00
John Baldwin
a955c461ad The corrected error count field is dependent on CMCI, not TES.
MFC after:	1 week
2010-07-28 21:52:09 +00:00
Matthew D Fleming
d7854da193 Add MALLOC_DEBUG_MAXZONES debug malloc(9) option to use multiple uma
zones for each malloc bucket size.  The purpose is to isolate
different malloc types into hash classes, so that any buffer overruns
or use-after-free will usually only affect memory from malloc types in
that hash class.  This is purely a debugging tool; by varying the hash
function and tracking which hash class was corrupted, the intersection
of the hash classes from each instance will point to a single malloc
type that is being misused.  At this point inspection or memguard(9)
can be used to catch the offending code.

Add MALLOC_DEBUG_MAXZONES=8 to -current GENERIC configuration files.
The suggestion to have this on by default came from Kostik Belousov on
-arch.

This code is based on work by Ron Steinke at Isilon Systems.

Reviewed by:    -arch (mostly silence)
Reviewed by:    zml
Approved by:    zml (mentor)
2010-07-28 15:36:12 +00:00
Alan Cox
a14a949872 The interpreter name should no longer be treated as a buffer that can be
overwritten.  (This change should have been included in r210545.)

Submitted by:	kib
2010-07-28 04:47:40 +00:00
John Baldwin
a3870a1826 Very rough first cut at NUMA support for the physical page allocator. For
now it uses a very dumb first-touch allocation policy.  This will change in
the future.
- Each architecture indicates the maximum number of supported memory domains
  via a new VM_NDOMAIN parameter in <machine/vmparam.h>.
- Each cpu now has a PCPU_GET(domain) member to indicate the memory domain
  a CPU belongs to.  Domain values are dense and numbered from 0.
- When a platform supports multiple domains, the default freelist
  (VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain.
  The MD code is required to populate an array of mem_affinity structures.
  Each entry in the array defines a range of memory (start and end) and a
  domain for the range.  Multiple entries may be present for a single
  domain.  The list is terminated by an entry where all fields are zero.
  This array of structures is used to split up phys_avail[] regions that
  fall in VM_FREELIST_DEFAULT into per-domain freelists.
- Each memory domain has a separate lookup-array of freelists that is
  used when fulfulling a physical memory allocation.  Right now the
  per-domain freelists are listed in a round-robin order for each domain.
  In the future a table such as the ACPI SLIT table may be used to order
  the per-domain lookup lists based on the penalty for each memory domain
  relative to a specific domain.  The lookup lists may be examined via a
  new vm.phys.lookup_lists sysctl.
- The first-touch policy is implemented by using PCPU_GET(domain) to
  pick a lookup list when allocating memory.

Reviewed by:	alc
2010-07-27 20:33:50 +00:00
Jung-uk Kim
172754036a Simplify fldcw() macro. There is no reason to use pointer here. No object
file change after this commit (verified with md5).
2010-07-26 23:20:55 +00:00
Jung-uk Kim
07c86dcf54 Add missing ldmxcsr() prototype for lint case. 2010-07-26 23:02:18 +00:00
Jung-uk Kim
30402401a7 Reduce diff against fenv.h:
Mark all inline asms as volatile for safety.  No object file change after
this commit (verified with md5).
2010-07-26 22:16:36 +00:00
Jung-uk Kim
2e50fa36a5 FNSTSW instruction can use AX register as an operand.
Obtained from:	fenv.h
2010-07-26 21:24:52 +00:00
Jung-uk Kim
9bfb10b154 Re-implement FPU suspend/resume for amd64. This removes superfluous uses
of critical_enter(9) and critical_exit(9) by fpugetregs() and fpusetregs().
Also, we do not touch PCB flags any more.

MFC after:	1 month
2010-07-26 19:53:09 +00:00
Konstantin Belousov
970eba46d5 Remove unneeded includes.
Submitted by:	alc
MFC after:	1 week
2010-07-26 14:38:51 +00:00
Konstantin Belousov
d48dda1244 Regen 2010-07-23 21:31:03 +00:00
Konstantin Belousov
0b53d1569e Remove the linux_exec_copyin_args(), freebsd32_exec_copyin_args() may
server as well. COMPAT_FREEBSD32 is a prerequisite for COMPAT_LINUX32.

Reviewed by:	alc
MFC after:	3 weeks
2010-07-23 21:30:33 +00:00
Alan Cox
69a8f9e3d1 Eliminate a little bit of duplicated code. 2010-07-23 18:58:27 +00:00
Konstantin Belousov
87d45a0392 When compat32 binary asks for the value of hw.machine_arch, report the
name of 32bit sibling architecture instead of the host one. Do the
same for hw.machine on amd64.

Add a safety belt debug.adaptive_machine_arch sysctl, to turn the
substitution off.

Reviewed by:	jhb, nwhitehorn
MFC after:	2 weeks
2010-07-22 09:13:49 +00:00
Alexander Motin
060d7431b5 Add hints for i8254 timer on i386 and amd64. Some people report about
systems with PnP/ACPI not reporting i8254 timer. In some cases it can be
fatal, as i8254 can be the only available time counter hardware. From other
side we are now heavily depend on i8254 timer and till the last time it's
init/usage was completely hardcoded. So this change just restores previous
behavior in more regular fashion.
2010-07-16 23:21:46 +00:00
Alexander Motin
fcc06be1b2 Move functions declaration to MI code, following implementation. 2010-07-15 17:49:35 +00:00
Alan Cox
7db39bcb05 Optimize pmap_remove()'s handling of PG_G mappings. Specifically,
instead of calling pmap_invalidate_page() for each PG_G mapping, call
pmap_invalidate_range() for each range of PG_G mappings.  In addition,
eliminate a redundant call to pmap_invalidate_page().  Both
pmap_remove_pte() and pmap_remove_page() called pmap_invalidate_page()
when the mapping had the PG_G attribute.  Now, only pmap_remove_page()
calls pmap_invalidate_page().  Altogether, these changes eliminate 53%
of the TLB shootdowns for a "buildworld" on a ZFS file system.  On
FFS, the reduction is 3%.

MFC after:	6 weeks
2010-07-15 16:25:51 +00:00
Bernhard Schmidt
774f94f14c - Update 6000 firmware to 9.221.4.1
- Add 6050 firmware

MFC after:	2 weeks
2010-07-15 11:26:07 +00:00
Warner Losh
1003cfe94d Remove obsolete undef of COPY_SIGCODE. It appears to have not been
used in FreeBSD in quite some time (maybe since before 4.4-lite :)

Submitted by:	bde
2010-07-13 15:06:13 +00:00
Jung-uk Kim
588697d478 Move i386-inherited logic of building ACPI headers for acpi_wakeup.c into
better places and remove intermediate makefile and shell scripts.  This
makes parallel kernel build little bit safer for amd64.
2010-07-12 21:08:35 +00:00
John Baldwin
05f81eb67d Remove a dead test. We already exclude NMI traps from this code in an
earlier condition.

MFC after:	1 week
2010-07-12 20:45:37 +00:00
Konstantin Belousov
aaa95ccb50 When switching the thread from the processor, store %dr7 content
into the pcb before disabling watchpoints. Otherwise, when the
thread is restored on a processor, watchpoints are still disabled.

Submitted by:	Tijl Coosemans <tijl coosemans org>
	(I would be much happier if Tijl commited this himself)
MFC after:	1 week
2010-07-12 19:59:15 +00:00
Alan Cox
8155e5d561 Reduce the number of global TLB shootdowns generated by pmap_qenter().
Specifically, teach pmap_qenter() to recognize the case when it is being
asked to replace a mapping with the very same mapping and not generate
a shootdown.  Unfortunately, the buffer cache commonly passes an entire
buffer to pmap_qenter() when only a subset of the mappings are changing.
For the extension of buffers in allocbuf() this was resulting in
unnecessary shootdowns.  The addition of new pages to the end of the
buffer need not and did not trigger a shootdown, but overwriting the
initial mappings with the very same mappings was seen as a change that
necessitated a shootdown.  With this change, that is no longer so.

For a "buildworld" on amd64, this change eliminates 14-15% of the
pmap_invalidate_range() shootdowns, and about 4% of the overall
shootdowns.

MFC after:	3 weeks
2010-07-10 18:22:44 +00:00
Konstantin Belousov
2680dac9e1 For both i386 and amd64 pmap,
- change the type of pm_active to cpumask_t, which it is;
- in pmap_remove_pages(), compare with PCPU(curpmap), instead of
  dereferencing the long chain of pointers [1].
For amd64 pmap, remove the unneeded checks for validity of curpmap
in pmap_activate(), since curpmap should be always valid after
r209789.

Submitted by:	alc [1]
Reviewed by:	alc
MFC after:	3 weeks
2010-07-09 20:05:56 +00:00
Alan Cox
77cb6e6f8d Correctly maintain the per-cpu field "curpmap" on amd64 just like we
do on i386.  The consequences of not doing so on amd64 became apparent
with the introduction of the COUNT_IPIS and COUNT_XINVLTLB_HITS
options.  Specifically, single-threaded applications were generating
unnecessary IPIs to shoot-down the TLB on other processors.  However,
this is clearly nonsensical because a single-threaded application is
only running on the current processor.  The reason that this happens
is that pmap_activate() is unable to properly update the old pmap's
field "pm_active" without the correct "curpmap".  So, in effect, stale
bits in "pm_active" were leading pmap_protect(), pmap_remove(),
pmap_remove_pages(), etc. to flush the TLB contents on some arbitrary
processor that wasn't even running the same application.

Reviewed by:	kib
MFC after:	3 weeks
2010-07-08 03:35:00 +00:00
Rui Paulo
80599de862 Fix style issues with the previous commit, namely
use-tab-instead-of-space and don't use underscores in macro variables.

Pointed out by:	bde
2010-07-07 12:08:58 +00:00
Kevin Lo
054d87d697 Add the u3g(4) driver. I can't find any reason why it's not here. 2010-07-07 09:23:46 +00:00
Rui Paulo
8923c96ee1 Introduce USD_{SET,GET}{BASE,LIMIT}. These help setting up the user
segment descriptor hi and lo values. Idea from Solaris.

Reviewed by:	kib
2010-07-06 16:56:27 +00:00
Alexander Motin
af565edaaa Revert r209638. After commit, there appeared to be more people who liked
previous name of stray interrupt counters, then responded to the list.
2010-07-02 17:22:15 +00:00
Alexander Motin
b7bc6aa726 Make stray irq counters have format alike to other counters. Unified format
makes string processing (for example by `systat -vm`) easier.
2010-07-01 21:58:46 +00:00
John Baldwin
fc0de8f0b6 Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to
<sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
2010-06-30 18:03:42 +00:00
Konstantin Belousov
13cedde2cb Regenerate 2010-06-28 18:17:21 +00:00
Konstantin Belousov
595473a587 Clear DF bit in eflags/rflags on the kernel entry. The i386 and amd64
ABI specifies the DF should be zero, and newer compilers do not clear
DF before using DF-sensitive instructions.

The DF clearing for signal handlers was done some time ago.

MFC after:	1 week
2010-06-23 20:44:07 +00:00
Konstantin Belousov
692add74d8 Fix bugs on pc98, use npxgetuserregs() instead of npxgetregs() for
get_fpcontext(), and npxsetuserregs() for set_fpcontext). Also,
note that usercontext is not initialized anymore in fpstate_drop().

Systematically replace references to npxgetregs() and npxsetregs()
by npxgetuserregs() and npxsetuserregs() in comments.

Noted by:	bde
2010-06-23 12:17:13 +00:00
Alexander Motin
25eb1b8c15 Some style fixes for r209371.
Submitted by:	jhb@
2010-06-22 16:20:10 +00:00
Alexander Motin
875b8844be Implement new event timers infrastructure. It provides unified APIs for
writing event timer drivers, for choosing best possible drivers by machine
independent code and for operating them to supply kernel with hardclock(),
statclock() and profclock() events in unified fashion on various hardware.

Infrastructure provides support for both per-CPU (independent for every CPU
core) and global timers in periodic and one-shot modes. MI management code
at this moment uses only periodic mode, but one-shot mode use planned for
later, as part of tickless kernel project.

For this moment infrastructure used on i386 and amd64 architectures. Other
archs are welcome to follow, while their current operation should not be
affected.

This patch updates existing drivers (i8254, RTC and LAPIC) for the new
order, and adds event timers support into the HPET driver. These drivers
have different capabilities:
 LAPIC - per-CPU timer, supports periodic and one-shot operation, may
freeze in C3 state, calibrated on first use, so may be not exactly precise.
 HPET - depending on hardware can work as per-CPU or global, supports
periodic and one-shot operation, usually provides several event timers.
 i8254 - global, limited to periodic mode, because same hardware used also
as time counter.
 RTC - global, supports only periodic mode, set of frequencies in Hz
limited by powers of 2.

Depending on hardware capabilities, drivers preferred in following orders,
either LAPIC, HPETs, i8254, RTC or HPETs, LAPIC, i8254, RTC.
User may explicitly specify wanted timers via loader tunables or sysctls:
kern.eventtimer.timer1 and kern.eventtimer.timer2.
If requested driver is unavailable or unoperational, system will try to
replace it. If no more timers available or "NONE" specified for second,
system will operate using only one timer, multiplying it's frequency by few
times and uing respective dividers to honor hz, stathz and profhz values,
set during initial setup.
2010-06-20 21:33:29 +00:00
Konstantin Belousov
9e3e64e797 Only enable kdtrace hook in the LINT on the architectures that implement it. 2010-06-18 18:51:09 +00:00
Konstantin Belousov
b376ebac6b In the ia32_{get,set}_fpcontext(), use fpu{get,set}userregs instead
of fpu{get,set}regs.

Noted by:	bde
MFC after:	1 month
2010-06-17 12:35:17 +00:00
Alexander Motin
d364638110 Merge COUNT_XINVLTLB_HITS and COUNT_IPIS kernel options from i386 to amd64.
This information can be very valuable for CPU sleep-time (and respectively
idle power consumption) optimization.

Add counters for timer-related IPIs.

Reviewed by:	jhb@ (previous version)
2010-06-17 11:54:49 +00:00
John Baldwin
61d3f0bab2 Restore the machine check register banks on resume. For banks being
monitored via CMCI, reset the interrupt threshold to 1 on resume.

Reviewed by:	jkim
MFC after:	2 weeks
2010-06-15 18:51:41 +00:00
Konstantin Belousov
07c809237a Remove two obsoleted comments, add a note about 32bit compatibility.
MFC after:	1 month
2010-06-15 18:16:04 +00:00
Konstantin Belousov
4a23ecc77e Rename CRITSECT_ASSERT to CRITICAL_ASSERT.
Suggested by:	jhb
MFC after:	1 month
2010-06-15 14:59:35 +00:00
Konstantin Belousov
997534958e Use critical sections instead of disabling local interrupts to ensure
the consistency between PCPU fpcurthread and the state of the FPU.

Explicitely assert that the calling conventions for fpudrop() are
adhered too. In cpu_thread_exit(), add missed critical section entrance.

Reviewed by:	bde
Tested by:	pho
MFC after:	1 month
2010-06-15 09:19:33 +00:00
Jung-uk Kim
c977e11721 Fix ACPI suspend/resume on amd64, which was broken since r208833.
We need actual storage for FPU state to save and restore.
2010-06-14 20:08:26 +00:00
Alexander Motin
2f9fc3899b Fix bug introduced in SVN rev 194985. When calling pic_assign_cpu()
for pre-bound IRQs during boot, submit there LAPIC ID, same as in other
places, not CPU ID.
2010-06-14 07:38:53 +00:00
John Baldwin
3aa6d94e0c Update several places that iterate over CPUs to use CPU_FOREACH(). 2010-06-11 18:46:34 +00:00
Alan Cox
9124d0d6a3 Relax one of the new assertions in pmap_enter() a little. Specifically,
allow pmap_enter() to be performed on an unmanaged page that doesn't have
VPO_BUSY set.  Having VPO_BUSY set really only matters for managed pages.
(See, for example, pmap_remove_write().)
2010-06-11 15:49:39 +00:00
Alexander Kabaev
60743cbd22 Do not require pos parameter to be zero in MAP_ANONYMOUS mmap requests
in Linux emulation layer. Linux seems to only require that pos is
page-aligned, but otherwise ignores it. Default FreeBSD mmap parameter
checking is too strict to allow some Linux binaries to run. tsMuxeR is
one example of such a binary.

Discussed with:	jhb
MFC after:	1 week
2010-06-10 17:59:47 +00:00
Alan Cox
ce18658792 Reduce the scope of the page queues lock and the number of
PG_REFERENCED changes in vm_pageout_object_deactivate_pages().
Simplify this function's inner loop using TAILQ_FOREACH(), and shorten
some of its overly long lines.  Update a stale comment.

Assert that PG_REFERENCED may be cleared only if the object containing
the page is locked.  Add a comment documenting this.

Assert that a caller to vm_page_requeue() holds the page queues lock,
and assert that the page is on a page queue.

Push down the page queues lock into pmap_ts_referenced() and
pmap_page_exists_quick().  (As of now, there are no longer any pmap
functions that expect to be called with the page queues lock held.)

Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever
be passed an unmanaged page.  Assert this rather than returning "0"
and "FALSE" respectively.

ARM:

Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH().

Push down the page queues lock inside of pmap_clearbit(), simplifying
pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write().
Additionally, this allows for avoiding the acquisition of the page
queues lock in some cases.

PowerPC/AIM:

moea*_page_exits_quick() and moea*_page_wired_mappings() will never be
called before pmap initialization is complete.  Therefore, the check
for moea_initialized can be eliminated.

Push down the page queues lock inside of moea*_clear_bit(),
simplifying moea*_clear_modify() and moea*_clear_reference().

The last parameter to moea*_clear_bit() is never used.  Eliminate it.

PowerPC/BookE:

Simplify mmu_booke_page_exists_quick()'s control flow.

Reviewed by:	kib@
2010-06-10 16:56:35 +00:00