2003 Commits

Author SHA1 Message Date
marcel
f2fb38d67f Fix previous commit: unbreak build of libkvm by including sys/systm.h
only when _KERNEL is defined.

Approved by:	re@ (implicit)
2014-09-07 21:40:14 +00:00
marcel
1e706702bd Fix the PCPU access macros. It was found that the PCPU pointer, when
held in register r13, is used outside the bounds of critical_enter()
and critical_exit() by virtue of optimizations performed by the
compiler. The net effect being that address computations of fields
in the PCPU structure could be relative to the PCPU structure of the
CPU on which the address computation was performed and not related
to the CPU that executes the actual load or store operation.
The typical failure mode being that the per-CPU cache of UMA got
corrupted due to accesses from other CPUs.

Adding more volatile decorating to the register expression does not
help. The thinking being that volatile is assumed to work on memory
references and not register references. Thus, the fix is to perform
the address computation using a volatile inline assembly statement.

Additionally, since the reference is fundamentally non-atomic on ia64
by virtue of have a distinct address computation followed by the
actual load or store operation, it is required to wrap the entire
PCPU access in a critical section.

With PCPU_GET and friends requiring curthread now that they're in a
critical section, low-level use of these macros in functions like
cpu_switch() is not possible anymore. Consequently, a second order
set of changes is needed to avoid using PCPU_GET and friends where
curthread is either not set yet, or in the process of being changed.
In those cases, explicit dereferencing of pcpup is needed. In those
cases it is also possible to do that.

This is a direct commit to stable/10.

Approved by:	re@ (marius)
2014-09-06 22:17:54 +00:00
kib
798eea1614 Fix a leak of the wired pages when unwiring of the PROT_NONE-mapped
wired region.  Rework the handling of unwire to do the it in batch,
both at pmap and object level.

All commits below are by alc.

MFC r268327:
Introduce pmap_unwire().

MFC r268591:
Implement pmap_unwire() for powerpc.

MFC r268776:
Implement pmap_unwire() for arm.

MFC r268806:
pmap_unwire(9) man page.

MFC r269134:
When unwiring a region of an address space, do not assume that the
underlying physical pages are mapped by the pmap.  This fixes a leak
of the wired pages on the unwiring of the region mapped with no access
allowed.

MFC r269339:
In the implementation of the new function pmap_unwire(), the call to
MOEA64_PVO_TO_PTE() must be performed before any changes are made to the
PVO. Otherwise, MOEA64_PVO_TO_PTE() will panic.

MFC r269365:
Correct a long-standing problem in moea{,64}_pvo_enter() that was revealed
by the combination of r268591 and r269134: When we attempt to add the
wired attribute to an existing mapping, moea{,64}_pvo_enter() do nothing.
(They only set the wired attribute on newly created mappings.)

MFC r269433:
Handle wiring failures in vm_map_wire() with the new functions
pmap_unwire() and vm_object_unwire().
Retire vm_fault_{un,}wire(), since they are no longer used.

MFC r269438:
Rewrite a loop in vm_map_wire() so that gcc doesn't think that the variable
"rv" is uninitialized.

MFC r269485:
Retire pmap_change_wiring().

Reviewed by:	alc
2014-09-01 07:58:15 +00:00
alc
30905aaa33 Update an assertion to reflect the changes made in r270439. This is a
direct commit to stable/10 because ia64 is no longer supported by HEAD.

Reported by:	marcel
Sponsored by:	EMC / Isilon Storage Division
2014-08-30 03:41:47 +00:00
marcel
ef25b335df Make sure the psr field in the trapframe (which holds the value of cr.ipsr)
is properly synthesized for the EPC syscall. Properly synthesized in this
case means that the bank number (BN bitfield) is set to 1. This is needed
because the move-from-PSR instruction does copy all bits! In this case
the BN bitfield was not copied.

While normally this is not a problem, because when we leave the kernel via
the EPC syscall path again, we don't actually care about the BN bitfield.
We restore PSR with a move-to-PSR instruction, which also doesn't cover
the BN bitfield.

There is however a scenario where we enter the kernel via the EPC syscall
path and leave the kernel via the exception/interrupt path. That path
uses the RFI (Return-From-Interrupt) instruction and it restores all bits.
What happens in that case is that we don't properly switch to register
bank 1 and any exception/interrupt that happens while running in bank 0
clobbers the process' (or kernel's) banked registers. This is because the
CPU switches to bank 0 on an exception/interrupt so that there are 16
general registers available for constructing a trapframe and saving the
context. Consequently: normal code should always use register bank 1.

This bug has been present since 2003 (11 years) and has been the cause
for many "unexplained" kernel panics. It says something about how often
we hit this problem on the one hand and how tricky it was to find it.

Many thanks to: clusteradm@ for enabling me to track this down!
2014-08-25 15:15:59 +00:00
kib
25782a7fab Merge the changes to pmap_enter(9) for sleep-less operation (requested
by flag).  The ia64 pmap.c changes are direct commit, since ia64 is
removed on head.

MFC r269368 (by alc):
Retire PVO_EXECUTABLE.

MFC r269728:
Change pmap_enter(9) interface to take flags parameter and superpage
mapping size (currently unused).

MFC r269759 (by alc):
Update the text of a KASSERT() to reflect the changes in r269728.

MFC r269822 (by alc):
Change {_,}pmap_allocpte() so that they look for the flag
PMAP_ENTER_NOSLEEP instead of M_NOWAIT/M_WAITOK when deciding whether
to sleep on page table page allocation.

MFC r270151 (by alc):
Replace KASSERT that no PV list locks are held with a conditional
unlock.

Reviewed by:	alc
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
2014-08-24 07:53:15 +00:00
emaste
cb99f1dd80 MFC r263815, r263872:
Move ia64 efi.h to sys in preparation for amd64 UEFI support

    Prototypes specific to ia64 have been left in this file for now, under
    __ia64__, rather than moving them to a new header under sys/ia64.
    I anticipate that (some of) the corresponding functions will be shared
    by the amd64, arm64, i386, and ia64 architectures, and we can adjust
    this as EFI support on other than ia64 continues to develop.

  Fix missed efi.h header change in r263815

Sponsored by:	The FreeBSD Foundation
2014-08-21 19:51:07 +00:00
imp
07bb5d08f3 MFC r263749,267146:
>r267146 | imp | 2014-06-05 22:08:55 -0600 (Thu, 05 Jun 2014) | 4 lines
>Restore comments accidentally removed.

>r263749 | imp | 2014-03-25 16:08:31 -0600 (Tue, 25 Mar 2014) | 18 lines
>Rather than require a makeoptions DEBUG to get debug correct,
>add it in kern.mk, but only if we're using clang. While this
>option is supported by both clang and gcc, in the future there
>may be changes to clang which change the defaults that require
>a tweak to build our kernel such that other tools in our tree
>will work. Set a good example by forcing -gdwarf-2 only for
>clang builds, and only if the user hasn't specified another
>dwarf level already. Update UPDATING to reflect the changed
>state of affairs. This also keeps us from having to update
>all the ARM kernels to add this, and also keeps us from
>in the future having to update all the MIPS kernels and is
>one less place the user will have to know to do something
>special for clang and one less thing developers will need
>to do when moving an architecture to clang.
2014-07-17 22:31:46 +00:00
marcel
780c455ef4 MFC r263380 & r268185: Add KTR events for the PMAP interface functions. 2014-07-02 23:57:55 +00:00
marcel
67e9cbff48 MFC r263323: Fix and improve exception tracing. 2014-07-02 23:47:43 +00:00
marcel
e7195b2eb0 MFC r263254: Move the implementation of kdb_cpu_trap() from <machine/kdb.h>
to machdep.c.
2014-07-02 23:37:14 +00:00
marcel
e507d756a8 MFC r263253: Don't use the ITC as the faulting address for external
interrupts.
2014-07-02 23:33:07 +00:00
marcel
7ee78fc350 MFC r263248 & r263257: In intr_event_handle() we already save and set
td_intr_frame, so don't do it also in ia64_handle_intr().
2014-07-02 23:23:18 +00:00
marcel
69fcf37dcf MFC r262726: When reading physical memory, make sure to access it using
the right memory attributes.
2014-07-02 23:12:56 +00:00
marcel
ee8a73303b MFC r259959 & r260009: Add prototypical support for minidumps. 2014-07-02 23:05:57 +00:00
marcel
a4a765f93d MFC r257484: Change PAL_PTCE_INFO related variables. 2014-07-02 22:19:59 +00:00
marcel
0fa78ab121 MFC r257477: Purge the translation cache of APs before we unleash them. 2014-07-02 22:06:31 +00:00
marcel
ccea0f0fb7 MFC 257475: Respect the kern.smp.disabled tunable. 2014-07-02 21:53:34 +00:00
ian
4c5f4bdec5 MFC 263301
In kernel config files, it is supposed to be 'options<space><tab>' not
  'options<tab><tab>', per long standing (but recently not so strictly
  enforced) convention.
2014-05-17 17:34:37 +00:00
ian
3724a25461 MFC 263036, 263059: delete advertising clause in licenses, renumber. 2014-05-17 13:59:11 +00:00
ian
8e2dd9b5e3 MFC r257854 (discussed with alc@)
As of r257209, all architectures have defined
  VM_KMEM_SIZE_SCALE.  In other words, every architecture is now
  auto-sizing the kmem arena.  This revision changes kmeminit() so
  that the definition of VM_KMEM_SIZE_SCALE becomes mandatory and
  the definition of VM_KMEM_SIZE becomes optional.

  Replace or eliminate all existing definitions of VM_KMEM_SIZE.
  With auto-sizing enabled, VM_KMEM_SIZE effectively became an
  alternate spelling for VM_KMEM_SIZE_MIN on most architectures.
  Use VM_KMEM_SIZE_MIN for clarity.
2014-05-16 01:30:30 +00:00
scottl
96be897ce1 Merge r264984
Retire smp_active.  It was racey and caused demonstrated problems with
the cpufreq code.  Replace its use with smp_started.  There's at least
one userland tool that still looks at the kern.smp.active sysctl, so
preserve it but point it to smp_started as well.

Obtained from:	Netflix, Inc.
2014-05-07 20:28:27 +00:00
ken
a354f057f0 MFC the mpr(4) driver for LSI's 12Gb SAS cards.
This includes r265236, r265237, r265241 and r265261:

  ------------------------------------------------------------------------
  r265236 | ken | 2014-05-02 14:25:09 -0600 (Fri, 02 May 2014) | 51 lines

  Bring in the mpr(4) driver for LSI's MPT3 12Gb SAS controllers.

  This is derived from the mps(4) driver, but it supports only the 12Gb
  IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108.

  Some notes about this driver:
   o The 12Gb hardware can do "FastPath" I/O, and that capability is included in
     this driver.

   o WarpDrive functionality has been removed, since it isn't supported in
     the 12Gb driver interface.

   o The Scatter/Gather list handling code is significantly different between
     the 6Gb and 12Gb hardware.  The 12Gb boards support IEEE Scatter/Gather
     lists.

  Thanks to LSI for developing and testing this driver for FreeBSD.

  share/man/man4/mpr.4:
  	mpr(4) man page.

  sys/dev/mpr/*:
  	mpr(4) driver files.

  sys/modules/Makefile,
  sys/modules/mpr/Makefile:
  	Add a module Makefile for the mpr(4) driver.

  sys/conf/files:
  	Add the mpr(4) driver.

  sys/amd64/conf/GENERIC,
  sys/i386/conf/GENERIC,
  sys/mips/conf/OCTEON1,
  sys/sparc64/conf/GENERIC:
  	Add the mpr(4) driver to all config files that currently
  	have the mps(4) driver.

  sys/ia64/conf/GENERIC:
  	Add the mps(4) and mpr(4) drivers to the ia64 GENERIC
  	config file.

  sys/i386/conf/XEN:
  	Exclude the mpr module from building here.

  Submitted by:	Steve McConnell <Stephen.McConnell@lsi.com>
  Tested by:	Chris Reeves <chrisr@spectralogic.com>
  Sponsored by:	LSI, Spectra Logic
  Relnotes:	LSI 12Gb SAS driver mpr(4) added

  ------------------------------------------------------------------------
  ------------------------------------------------------------------------
  r265237 | ken | 2014-05-02 14:36:20 -0600 (Fri, 02 May 2014) | 8 lines

  Add the mpr(4) man page to the man4 Makefile.

  This should have been included in r265236.

  Submitted by:	Steve McConnell <Stephen.McConnell@lsi.com>
  MFC after:	3 days
  Sponsored by:	LSI, Spectra Logic

  ------------------------------------------------------------------------
  ------------------------------------------------------------------------
  r265241 | brueffer | 2014-05-02 15:14:28 -0600 (Fri, 02 May 2014) | 2 lines

  Use our standard SYNOPSIS wording; perform some cleanup while here.

  ------------------------------------------------------------------------
  ------------------------------------------------------------------------
  r265261 | brueffer | 2014-05-03 05:15:28 -0600 (Sat, 03 May 2014) | 2 lines

  Add a missing colon.

  ------------------------------------------------------------------------

Submitted by:	Steve McConnell <Stephen.McConnell@lsi.com>
Tested by:	Chris Reeves <chrisr@spectralogic.com>
Sponsored by:	LSI, Spectra Logic
Relnotes:	LSI 12Gb SAS driver mpr(4) added
2014-05-05 20:35:35 +00:00
tijl
e8ab551e5f MFC r263998:
Rename __wchar_t so it no longer conflicts with __wchar_t from clang 3.4
-fms-extensions.
2014-04-15 09:41:52 +00:00
marcel
4bb58e4f5d MFC r260175:
Implement atomic_swap_<type>.
2014-02-16 23:08:21 +00:00
marcel
2aad389c5a MFC r260914:
In pmap_set_pte(), make sure to enforce ordering by inserting a memory fence.
2014-02-16 22:12:13 +00:00
marcel
4262006891 MFC r260666:
In the nested TLB fault handler, for a direct-mapped address, make
sure to clear the lower 12 bits.
2014-02-16 22:01:44 +00:00
marcel
6c90e2f6f5 MFC r259244:
Allow pmap_remove_pages() to be called for physical maps not
associated with the current thread.
2014-02-16 20:26:22 +00:00
marcel
0ab80a1194 MFC r257910:
Don't enable interrupts before we call sched_throw().
2014-02-16 19:20:13 +00:00
marcel
f00acf1f3c MFC r257487:
Use LOG2_ID_PAGE_SIZE again for the identity mapping in regions 6 & 7.
2014-02-16 19:12:50 +00:00
kib
5a15f697f5 MFC r257228:
Add bus_dmamap_load_ma() function to load map with the array of
vm_pages.
2013-12-17 13:38:21 +00:00
gjb
cfff9c715e - Remove debugging from GENERIC* kernel configurations
- Enable MALLOC_PRODUCTION
- Default dumpdev=NO
- Remove UPDATING entry regarding debugging features
- Bump __FreeBSD_version to 1000500

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2013-10-10 17:59:44 +00:00
alc
88a4d0f31a The pmap function pmap_clear_reference() is no longer used. Remove it.
pmap_clear_reference() has had exactly one caller in the kernel for
several years, more precisely, since FreeBSD 8.  Now, that call no
longer exists.

Approved by:	re (kib)
Sponsored by:	EMC / Isilon Storage Division
2013-09-20 04:30:18 +00:00
jhb
04bb6e10cd Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use
an address in the first 2GB of the process's address space.  This flag should
have the same semantics as the same flag on Linux.

To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address.  While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.

Reviewed by:	alc
Approved by:	re (kib)
2013-09-09 18:11:59 +00:00
glebius
358d3d145a On those machines, where sf_bufs do not represent any real object, make
sf_buf_alloc()/sf_buf_free() inlines, to save two calls to an absolutely
empty functions.

Reviewed by:	alc, kib, scottl
Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2013-09-06 05:37:49 +00:00
alc
aa9a7bb9e6 Significantly reduce the cost, i.e., run time, of calls to madvise(...,
MADV_DONTNEED) and madvise(..., MADV_FREE).  Specifically, introduce a new
pmap function, pmap_advise(), that operates on a range of virtual addresses
within the specified pmap, allowing for a more efficient implementation of
MADV_DONTNEED and MADV_FREE.  Previously, the implementation of
MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as
pmap_clear_reference().  Intuitively, the problem with this implementation
is that the pmap-level locks are acquired and released and the page table
traversed repeatedly, once for each resident page in the range
that was specified to madvise(2).  A more subtle flaw with the previous
implementation is that pmap_clear_reference() would clear the reference bit
on all mappings to the specified page, not just the mapping in the range
specified to madvise(2).

Since our malloc(3) makes heavy use of madvise(2), this change can have a
measureable impact.  For example, the system time for completing a parallel
"buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.

Note: This change only contains pmap_advise() implementations for a subset
of our supported architectures.  I will commit implementations for the
remaining architectures after further testing.  For now, a stub function is
sufficient because of the advisory nature of pmap_advise().

Discussed with: jeff, jhb, kib
Tested by:      pho (i386), marcel (ia64)
Sponsored by:   EMC / Isilon Storage Division
2013-08-29 15:49:05 +00:00
kib
05a9dff802 Revert r254501. Instead, reuse the type stability of the struct pmap
which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE
zone.  Initialize the pmap lock in the vmspace zone init function, and
remove pmap lock initialization and destruction from pmap_pinit() and
pmap_release().

Suggested and reviewed by:	alc (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-08-22 18:12:24 +00:00
pjd
ae1f4cb9ce Add process descriptors support to the GENERIC kernel. It is already being
used by the tools in base systems and with sandboxing more and more tools
the usage should only increase.

Submitted by:	Mariusz Zaborski <oshogbo@FreeBSD.org>
Sponsored by:	Google Summer of Code 2013
MFC after:	1 month
2013-08-18 10:21:29 +00:00
jkim
c55a3ec3ad Tidy up global locks for ACPICA. There is no functional change. 2013-08-13 21:34:03 +00:00
attilio
16c7563cf4 The soft and hard busy mechanism rely on the vm object lock to work.
Unify the 2 concept into a real, minimal, sxlock where the shared
acquisition represent the soft busy and the exclusive acquisition
represent the hard busy.
The old VPO_WANTED mechanism becames the hard-path for this new lock
and it becomes per-page rather than per-object.
The vm_object lock becames an interlock for this functionality:
it can be held in both read or write mode.
However, if the vm_object lock is held in read mode while acquiring
or releasing the busy state, the thread owner cannot make any
assumption on the busy state unless it is also busying it.

Also:
- Add a new flag to directly shared busy pages while vm_page_alloc
  and vm_page_grab are being executed.  This will be very helpful
  once these functions happen under a read object lock.
- Move the swapping sleep into its own per-object flag

The KPI is heavilly changed this is why the version is bumped.
It is very likely that some VM ports users will need to change
their own code.

Sponsored by:	EMC / Isilon storage division
Discussed with:	alc
Reviewed by:	jeff, kib
Tested by:	gavin, bapt (older version)
Tested by:	pho, scottl
2013-08-09 11:11:11 +00:00
avg
6f05d223ad follow up to r254051
- update powerpc/GENERIC64 as well, suggested by mdf
- update comments so that they make sense after the change, suggested by
  jhb

X-MFC after:	never (change specific to head)
2013-08-09 08:11:09 +00:00
avg
0255b98224 enable KDB_TRACE in GENERICs
KDB_TRACE is not an alternative to DDB/etc, they are complementary.
So I do not see any reason to not enable KDB_TRACE by default.

X-MFC after:	never (change specific to head)
2013-08-07 08:03:50 +00:00
jeff
de4ecca213 Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

 - Normalize functions that allocate memory to use kmem_*
 - Those that allocate address space are named kva_*
 - Those that operate on maps are named kmap_*
 - Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-07 06:21:20 +00:00
obrien
7999076e3e Back out r253779 & r253786. 2013-07-31 17:21:18 +00:00
obrien
721ce839c7 Decouple yarrow from random(4) device.
* Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option.
  The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow.

* random(4) device doesn't really depend on rijndael-*.  Yarrow, however, does.

* Add random_adaptors.[ch] which is basically a store of random_adaptor's.
  random_adaptor is basically an adapter that plugs in to random(4).
  random_adaptor can only be plugged in to random(4) very early in bootup.
  Unplugging random_adaptor from random(4) is not supported, and is probably a
  bad idea anyway, due to potential loss of entropy pools.
  We currently have 3 random_adaptors:
  + yarrow
  + rdrand (ivy.c)
  + nehemeiah

* Remove platform dependent logic from probe.c, and move it into
  corresponding registration routines of each random_adaptor provider.
  probe.c doesn't do anything other than picking a specific random_adaptor
  from a list of registered ones.

* If the kernel doesn't have any random_adaptor adapters present then the
  creation of /dev/random is postponed until next random_adaptor is kldload'ed.

* Fix randomdev_soft.c to refer to its own random_adaptor, instead of a
  system wide one.

Submitted by: arthurmesh@gmail.com, obrien
Obtained from: Juniper Networks
Reviewed by: obrien
2013-07-29 20:26:27 +00:00
avg
4e6c4b2a36 Revert r253748,253749
This WIP should not have been committed yet.

Pointyhat to:	avg
2013-07-28 18:44:17 +00:00
avg
ea04b46439 put contents of cpu.h under _KERNEL
no userland-serviceable parts inside

MFC after:	20 days
2013-07-28 18:32:27 +00:00
marcel
d50608e9d4 In pci_cfgregread() and pci_cfgregwrite(), multiplex the domain and
bus number into the bus argument. The bus number occupies the least
significant 8 bits. The PCI domain occupies the most significant 24
bits.

On the Altix 350, the PCI domain is a required parameter, but
changing the prototype of the pci_cfgreg*() functions to include a
separate domain argument has wide-spread consequences across the
supported architectures. We'd be changing a known interface.

Multiplexing is an acceptable kluge to give us what we need with
manageable impact. Note that the PCI bus number fits in 8 bits,
so the multiplexing of the domain is a backward compatible change.
2013-07-23 03:03:17 +00:00
marcel
04d34b86e0 In ia64_mca_init(), don't limit the allocation of the info block to
fall within the first 256MB of memory. The origin/reason for that
limitation is not known, but it's not believed to be required for
proper initialization. What is known is that the Altix 350 does not
have physical memory at that address (by virtue of the address space
bits).

Keep the boundary at 256MB so that the info block will be covered
by a single direct-mapped translation.

While here, change the flags to M_NOWAIT to eliminate confusion. It
does not change the behaviour of contigmalloc(). What is does is
makes the flags argument explicitly say what the actual behaviour
is.
2013-07-23 02:38:23 +00:00
marcel
0680c1f5fe In pmap_mapdev(), if the physical memory range is not covered by an EFI
memory descriptor, don't return NULL as the virtual address, return the
direct-mapped uncacheable virtual address for it. At first, this was
needed only for the Altix 350, but now even some high-end HP machines
have devices mapped to physical addresses that aren't covered by the
EFI memory map.
2013-07-23 02:11:22 +00:00