Commit Graph

116 Commits

Author SHA1 Message Date
Attilio Rao
dc6fbf6545 * Completely remove the STOP_NMI option from the kernel. This option
proved useful when entering KDB via an NMI, but on other occasions it
completely violates the rule that interrupts must remain disabled while
holding a spinlock, which can cause deadlocks in situations where a
normal IPI_STOP is expected.
* Add a new IPI called IPI_STOP_HARD on all supported architectures.
This IPI sends a stop message among CPUs over a privileged channel
where one is available; otherwise it simply matches a normal IPI_STOP.
Right now IPI_STOP_HARD uses an NMI on the ia32 and amd64
architectures, while on the others it has the effect of a normal
IPI_STOP. It is the maintainers' responsibility to eventually
implement a hard stop where necessary and possible.
* Use the new IPI facility to implement a new SMP kernel function,
stop_cpus_hard(). It mirrors stop_cpus() but uses the privileged
channel for the stopping facility (see the sketch below).
* Let KDB use the newly introduced stop_cpus_hard() and leave
stop_cpus() for all other cases.
* Disable interrupts on CPU0 when starting the suspension of the APs.
* Style cleanup and comment additions.

This patch should fix the reboot/shutdown deadlocks that many users
keep reporting on the mailing lists.

Please don't forget to remove the STOP_NMI option from your kernel
config file.

Reviewed by:	jhb
Tested by:	pho, bz, rink
Approved by:	re (kib)
2009-08-13 17:09:45 +00:00
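
A minimal sketch of how a debugger-style consumer might use the new interface; the helper name is invented, and PCPU_GET(other_cpus) and restart_cpus() are assumed from the cpumask-based SMP API of this era rather than introduced by this commit:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/pcpu.h>
#include <sys/smp.h>

static void
freeze_other_cpus_example(void)
{
	cpumask_t other_cpus;

	other_cpus = PCPU_GET(other_cpus);	/* every CPU except this one */
	stop_cpus_hard(other_cpus);		/* NMI-backed on ia32/amd64 */

	/* ... inspect machine state while the other CPUs are halted ... */

	restart_cpus(other_cpus);
}

Ordinary consumers that do not need the privileged channel keep calling stop_cpus() in exactly the same way.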
John Baldwin
013818111a Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to
a device pager (OBJT_DEVICE) object in that it uses fictitious pages to
provide aliases to other memory addresses.  The primary difference is that
it uses an sglist(9) to determine the physical addresses for a given offset
into the object instead of invoking the d_mmap() method in a device driver.

Reviewed by:	alc
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-24 13:50:29 +00:00
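
A hedged sketch of the driver-side half of this scheme: describe a couple of physical ranges with sglist(9) and let that list back the object. The physical addresses are made up, and the exact vm_pager_allocate() argument list for OBJT_SG is left as a comment because it is not spelled out above:

#include <sys/param.h>
#include <sys/malloc.h>
#include <sys/sglist.h>
#include <vm/vm.h>

static struct sglist *
build_sg_ranges_example(void)
{
	struct sglist *sg;

	sg = sglist_alloc(2, M_WAITOK);			/* room for two segments */
	sglist_append_phys(sg, 0xd0000000UL, PAGE_SIZE);	/* hypothetical */
	sglist_append_phys(sg, 0xd0010000UL, PAGE_SIZE);	/* device pages */

	/*
	 * The list would then be handed to the VM system as the handle of
	 * an OBJT_SG object (e.g. via vm_pager_allocate(OBJT_SG, sg, ...));
	 * offsets into the object resolve through the sglist rather than
	 * through a driver's d_mmap() method.
	 */
	return (sg);
}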
Alan Cox
3153e878dd Add support to the virtual memory system for configuring machine-
dependent memory attributes:

Rename vm_cache_mode_t to vm_memattr_t.  The new name reflects the
fact that there are machine-dependent memory attributes that have
nothing to do with controlling the cache's behavior.

Introduce vm_object_set_memattr() for setting the default memory
attributes that will be given to an object's pages.

Introduce and use pmap_page_{get,set}_memattr() for getting and
setting a page's machine-dependent memory attributes.  Add full
support for these functions on amd64 and i386 and stubs for them on
the other architectures.  The function pmap_page_set_memattr() is also
responsible for any other machine-dependent aspects of changing a
page's memory attributes, such as flushing the cache or updating the
direct map.  The uses include kmem_alloc_contig(), vm_page_alloc(),
and the device pager:

  kmem_alloc_contig() can now be used to allocate kernel memory with
  non-default memory attributes on amd64 and i386.

  vm_page_alloc() and the device pager will set the memory attributes
  for the real or fictitious page according to the object's default
  memory attributes.

Update the various pmap functions on amd64 and i386 that map pages to
incorporate each page's memory attributes in the mapping.

Notes: (1) Inherent to this design are safety features that prevent
the specification of inconsistent memory attributes by different
mappings on amd64 and i386.  In addition, the device pager provides a
warning when a device driver creates a fictitious page with memory
attributes that are inconsistent with the real page that the
fictitious page is an alias for. (2) Storing the machine-dependent
memory attributes for amd64 and i386 as a dedicated "int" in "struct
md_page" represents a compromise between space efficiency and the ease
of MFCing these changes to RELENG_7.

In collaboration with: jhb

Approved by:	re (kib)
2009-07-12 23:31:20 +00:00
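
A hedged sketch of the allocator side of this change: a physically contiguous, uncacheable kernel buffer below 4GB. The parameter order of kmem_alloc_contig() and the VM_MEMATTR_UNCACHEABLE name are assumptions based on the description above:

#include <sys/param.h>
#include <sys/malloc.h>
#include <sys/systm.h>
#include <vm/vm.h>
#include <vm/vm_extern.h>
#include <vm/vm_kern.h>
#include <vm/pmap.h>

static void *
alloc_uncached_buffer_example(vm_size_t size)
{
	vm_offset_t va;

	va = kmem_alloc_contig(kernel_map, size, M_WAITOK | M_ZERO,
	    0, 0xffffffff,		/* low/high physical bounds */
	    PAGE_SIZE, 0,		/* alignment, boundary */
	    VM_MEMATTR_UNCACHEABLE);	/* machine-dependent attribute */
	return ((void *)va);
}

On architectures that only have the stub implementations, a non-default attribute would have no effect on the resulting mappings.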
Sam Leffler
8c393fd1f0 Cleanup ALIGNED_POINTER:
o add to platforms where it was missing (arm, i386, powerpc, sparc64, sun4v)
o define as "1" on amd64 and i386 where there is no restriction
o make the type returned consistent with ALIGN
o remove _ALIGNED_POINTER
o make associated comments consistent

Reviewed by:	bde, imp, marcel
Approved by:	re (kensmith)
2009-07-05 17:45:48 +00:00
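
A small usage sketch of the macro being cleaned up here: ALIGNED_POINTER(p, t) says whether p may be dereferenced as type t on this platform (now simply 1 on amd64 and i386). The structure and buffer are invented for illustration:

#include <sys/param.h>
#include <sys/systm.h>

struct example_hdr {
	uint32_t	magic;
	uint32_t	length;
};

static void
read_hdr_example(const char *buf, struct example_hdr *out)
{
	if (ALIGNED_POINTER(buf, struct example_hdr))
		/* Safe to dereference directly on this platform. */
		*out = *(const struct example_hdr *)(const void *)buf;
	else
		/* Misaligned for this platform: fall back to a byte copy. */
		bcopy(buf, out, sizeof(*out));
}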
Warner Losh
ca72c49f42 Fix copyrights to reflect the origin of these files.
Approved by:	re@ (rwatson)
2009-06-29 16:45:50 +00:00
Alan Cox
5797795f5a Correct the #endif comment.
Noticed by:	jmallett
Approved by:	re (kib)
2009-06-26 16:22:24 +00:00
Robert Watson
eb956cd041 Use if_maddr_rlock()/if_maddr_runlock() rather than IF_ADDR_LOCK()/
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs.  This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.

For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.

Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 11:45:06 +00:00
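
A hedged sketch of the driver-side pattern this change standardizes on, with the device-specific filter programming left as a comment:

#include <sys/param.h>
#include <sys/queue.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/if_var.h>
#include <net/if_dl.h>

static void
example_setmulti(struct ifnet *ifp)
{
	struct ifmultiaddr *ifma;

	if_maddr_rlock(ifp);
	TAILQ_FOREACH(ifma, &ifp->if_multiaddrs, ifma_link) {
		if (ifma->ifma_addr->sa_family != AF_LINK)
			continue;
		/*
		 * LLADDR((struct sockaddr_dl *)ifma->ifma_addr) is the
		 * link-level address to hash into the device's multicast
		 * filter; that part is hardware-specific and omitted.
		 */
	}
	if_maddr_runlock(ifp);
}

Because drivers no longer name the lock directly, the locking strategy behind if_multiaddrs can change without touching them again.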
Alan Cox
e999111ae7 This change is the next step in implementing the cache control functionality
required by video card drivers.  Specifically, this change introduces
vm_cache_mode_t with an appropriate VM_CACHE_DEFAULT definition on all
architectures.  In addition, this change adds a vm_cache_mode_t parameter
to kmem_alloc_contig() and vm_phys_alloc_contig().  These will be the
interfaces for allocating mapped kernel memory and physical memory,
respectively, with non-default cache modes.

In collaboration with:	jhb
2009-06-26 04:47:43 +00:00
Jeff Roberson
50c202c592 Implement a facility for dynamic per-cpu variables.
- Modules and kernel code alike may use DPCPU_DEFINE(),
   DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined
   PCPU_*.  Requires only one instruction more than PCPU_*, is
   virtually the same as __thread for built-in objects, and is much
   faster than __thread for shared objects.  DPCPU variables can be
   initialized when defined.
 - Modules are supported by relocating the module's per-cpu linker set
   over space reserved in the kernel.  Modules may fail to load if there
   is insufficient space available.
 - Track space available for modules with a one-off extent allocator.
   Freeing may block while memory is allocated to record an extent.

Reviewed by:    jhb, rwatson, kan, sam, grehan, marius, marcel, stas
2009-06-23 22:42:39 +00:00
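
A minimal sketch of the new facility; the counter name is invented, and a header would expose it to other files with DPCPU_DECLARE():

#include <sys/param.h>
#include <sys/pcpu.h>

DPCPU_DEFINE(unsigned long, example_hits);	/* one copy per CPU */

static void
example_count_hit(void)
{
	/* Reads and writes the current CPU's copy, just like PCPU_* would. */
	DPCPU_SET(example_hits, DPCPU_GET(example_hits) + 1);
}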
Bjoern A. Zeeb
2aabdeb1f6 Add a .cvsignore file and, along with it, put an svn:ignore property
on the directory like we have for all other target architectures.

Discussed with:	imp (kind of)
2009-06-17 10:48:32 +00:00
Bjoern A. Zeeb
ed34ec5ed8 Make this compile again by using proper prototypes for the
pcib_read/write_config DEVMETHODs.
2009-06-17 10:26:37 +00:00
Bjoern A. Zeeb
23678e67c0 Make this compile again by using the correct prototype for the
device shutdown method.
2009-06-17 10:23:25 +00:00
Warner Losh
bf4969e4ac Fix typo... bad imp. 2009-06-14 03:32:52 +00:00
Warner Losh
b0734f67fa After Marcel's change to DEFAULTS, we were bringing in a bogus copy of
uart_8250.  Remove it here since the UART on the ADM5120 isn't the
typical 16550: it's completely different.
2009-06-14 02:58:56 +00:00
Warner Losh
dcd550fe2a Formatting nit. 2009-06-14 02:55:07 +00:00
Juli Mallett
e11e04c939 Fix MALTA build; some prototypes were wrong and blew up when kobj method
signature checking was turned on.
2009-06-12 22:49:35 +00:00
Alan Cox
5760d14d58 pmap_enter() *must* set PG_WRITEABLE on the given page if it creates a
mapping that permits write access.  Otherwise, pmap_remove_write() will not
remove write access from any of the page's mappings.
2009-05-23 22:05:14 +00:00
Alan Cox
56c4a67ba7 Give pmap_enter()'s third parameter the same name that it has on amd64 and
i386.  Otherwise, my next to last commit (r192628) to this file doesn't
actually compile.
2009-05-23 18:44:26 +00:00
Alan Cox
b4b264f3e9 When a page is mapped for write access on a read fault, the PTE should be
configured to trap on a write access unless *all* of the page's dirty bits
are set.
2009-05-23 18:33:22 +00:00
Alan Cox
e420c0cab7 Preset the modified bit in the PTE when pmap_enter() is called during a
write fault or while wiring a mapping that must support write access.

In general, this change should reduce the number of traps that occur for
the purpose of setting the modified bit.  More specifically, this change
should prevent traps while holding locks in a sysctl handler.  See
kern/kern_sysctl.c revisions 1.168 and 1.195 (svn r192160) for further
details.

Tested by: gonzo
2009-05-23 07:58:56 +00:00
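
A schematic sketch (not the committed MIPS code) of the rule above; EXAMPLE_PTE_DIRTY is a placeholder for the architecture's modified/dirty PTE bit, while the prot/access_type/wired arguments mirror the pmap_enter() interface of this era:

#include <sys/param.h>
#include <sys/systm.h>
#include <vm/vm.h>
#include <vm/vm_param.h>
#include <vm/vm_page.h>
#include <vm/pmap.h>

#define	EXAMPLE_PTE_DIRTY	0x0004	/* placeholder for the MD dirty bit */

static int
example_initial_pte_bits(vm_page_t m, vm_prot_t prot, vm_prot_t access_type,
    boolean_t wired)
{
	int bits = 0;

	if ((prot & VM_PROT_WRITE) != 0 &&
	    ((access_type & VM_PROT_WRITE) != 0 || wired)) {
		/*
		 * The caller is writing right now (write fault) or must be
		 * able to write without faulting later (wired mapping):
		 * preset the dirty bit and tell the VM the page is modified,
		 * instead of taking another trap on the first store.
		 */
		bits |= EXAMPLE_PTE_DIRTY;
		vm_page_dirty(m);
	}
	return (bits);
}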
Marcel Moolenaar
dbb95048da Add cpu_flush_dcache() for use after non-DMA based I/O so that a
possible future I-cache coherency operation can succeed. On ARM
for example the L1 cache can be (is) virtually mapped, which
means that any I/O that uses temporary mappings will not see the
I-cache made coherent. On ia64 a similar behaviour has been
observed. By flushing the D-cache, execution of binaries backed
by md(4) and/or NFS works reliably.
For Book-E (powerpc), execution over NFS exhibits SIGILL once in
a while as well, though cpu_flush_dcache() hasn't been implemented
yet.

Doing an explicit D-cache flush as part of the non-DMA based I/O
read operation eliminates the need to do it as part of the
I-cache coherency operation itself and as such avoids pessimizing
the DMA-based I/O read operations for which D-cache are already
flushed/invalidated. It also allows future optimizations whereby
the bcopy() followed by the D-cache flush can be integrated in a
single operation, which could be implemented using on-chips DMA
engines, by-passing the D-cache altogether.
2009-05-18 18:37:18 +00:00
Ulf Lilleengen
ae28ded2c8 - Fix spelling. 2009-05-16 15:21:08 +00:00
Jun Kuriyama
b3b17597ea - Use "device\t" and "options \t" for consistency. 2009-05-10 00:00:25 +00:00
Alan Cox
7b89d46e0f A variety of changes:
Reimplement "kernel_pmap" in the standard way.

Eliminate unused variables.  (These are mostly variables that were
discarded by the machine-independent layer after FreeBSD 4.x.)

Properly handle a vm_page_alloc() failure in pmap_init().

Eliminate dead or legacy (FreeBSD 4.x) code.

Eliminate unnecessary page queues locking.

Eliminate some excess white space.

Correct the synchronization of pmap_page_exists_quick().

Tested by: gonzo
2009-05-02 06:12:38 +00:00
Robert Watson
9725389e1e Don't conditionally define CACHE_LINE_SHIFT, as we anticipate sizing
a fair number of static data structures, making this an unlikely
option to try to change without also changing source code. [1]

Change default cache line size on ia64, sparc64, and sun4v to 128
bytes, as this was what rtld-elf was already using on those
platforms. [2]

Suggested by:	bde [1], jhb [2]
MFC after:	2 weeks
2009-04-20 12:59:23 +00:00
Alan Cox
1c951556bc MFamd64/i386
Introduce pmap_try_insert_pv_entry(), a function that conditionally
  creates a pv entry if the number of entries is below the high water mark
  for pv entries.

  Introduce pmap_enter_quick_locked() and use it to reimplement
  pmap_enter_object().  The old implementation was broken.  For example,
  it could block while holding a mutex lock.

  Change pmap_enter_quick_locked() to fail rather than wait if it is
  unable to allocate a page table page.  This prevents a race between
  pmap_enter_object() and the page daemon.  Specifically, an inactive
  page that is a successor to the page that was given to
  pmap_enter_quick_locked() might become a cache page while
  pmap_enter_quick_locked() waits and later pmap_enter_object() maps
  the cache page violating the invariant that cache pages are never
  mapped.  Similarly, change
  pmap_enter_quick_locked() to call pmap_try_insert_pv_entry() rather
  than pmap_insert_entry().  Generally speaking,
  pmap_enter_quick_locked() is used to create speculative mappings.  So,
  it should not try hard to allocate memory if free memory is scarce.

Tested by:	gonzo
2009-04-20 03:44:54 +00:00
Robert Watson
22037b2d2c Add description and cautionary note regarding CACHE_LINE_SIZE.
MFC after:	2 weeks
Suggested by:	alc
2009-04-19 21:26:36 +00:00
Robert Watson
a93fa8f2bb For each architecture, define CACHE_LINE_SHIFT and a derived
CACHE_LINE_SIZE constant.  These constants are intended to
over-estimate the cache line size, and be used at compile-time
when a run-time tuning alternative isn't appropriate or
available.

Defaults for all architectures are 64 bytes, except powerpc
where it is 128 bytes (used on G5 systems).

MFC after:	2 weeks
Discussed on:   arch@
2009-04-19 20:19:13 +00:00
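
A small usage sketch of the new constants: pad and align a hot structure so adjacent instances never share a cache line. The structure is invented for illustration:

#include <sys/param.h>

struct example_counter {
	unsigned long	hits;
	unsigned long	misses;
	/* Pad to a full line so adjacent array entries never share one. */
	char		_pad[CACHE_LINE_SIZE - 2 * sizeof(unsigned long)];
} __aligned(CACHE_LINE_SIZE);

Because the constants are over-estimates, the worst case is wasted padding rather than false sharing.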
Dmitry Chagin
cd899aad76 Fix KBI breakage introduced by r190520, which affects older linux.ko binaries:
1) Move the new field (brand_note) to the end of the Brandinfo structure.
2) Add a new flag BI_BRAND_NOTE that indicates that the brand_note pointer
   is valid.
3) Use the brand_note field only if the BI_BRAND_NOTE flag is set; old
   modules won't have the flag set, so the new brand_note field is
   simply ignored for them.

Suggested by:	jhb
Reviewed by:	jhb
Approved by:	kib (mentor)
MFC after:	6 days
2009-04-05 09:27:19 +00:00
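
A self-contained toy illustration of the KBI-preservation pattern used here: append new fields at the end of the structure and consult them only when a flag advertises their presence, so modules built against the old layout keep working. The struct and flag below are invented, standing in for Brandinfo and BI_BRAND_NOTE:

#include <stddef.h>

#define	TOY_HAS_NOTE	0x0002		/* analogous to BI_BRAND_NOTE */

struct toy_brandinfo {
	int		 brand;		/* old fields keep their offsets */
	const char	*interp_path;
	int		 flags;
	const void	*brand_note;	/* new field, appended at the end */
};

static const void *
toy_get_note(const struct toy_brandinfo *bi)
{
	/*
	 * Old modules never set TOY_HAS_NOTE, so the trailing field they
	 * never provided is never read.
	 */
	return ((bi->flags & TOY_HAS_NOTE) != 0 ? bi->brand_note : NULL);
}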
Bjoern A. Zeeb
f19dafc798 Mark the declaration of bus_space_map 'static' as the implementation is.
Follow one of the two most common indent schemes in this file.
This unbreaks a few mips kernel builds.
2009-03-28 23:24:34 +00:00
Konstantin Belousov
a4f2b2b0c6 Add AT_EXECPATH ELF auxinfo entry type. The value's a_ptr is a pointer
to the full path of the image that is being executed.
Increase AT_COUNT.

Remove the no-longer-true comment about the types used in Linux ELF
binaries; the listed types contain FreeBSD-specific entries.

Reviewed by:	kan
2009-03-17 12:50:16 +00:00
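
A hedged userland sketch of consuming the new entry: walk the aux vector, which follows the environment vector on the initial stack, and pull out AT_EXECPATH. It relies on environ still pointing at the startup vector and simplifies error handling:

#include <sys/types.h>
#include <sys/elf.h>
#include <stdio.h>

extern char **environ;

static const char *
find_execpath(void)
{
	char **p;
	Elf_Auxinfo *aux;

	for (p = environ; *p != NULL; p++)	/* skip the environment */
		continue;
	for (aux = (Elf_Auxinfo *)(p + 1); aux->a_type != AT_NULL; aux++)
		if (aux->a_type == AT_EXECPATH)
			return ((const char *)aux->a_un.a_ptr);
	return (NULL);
}

int
main(void)
{
	const char *path = find_execpath();

	printf("executed as: %s\n", path != NULL ? path : "(not found)");
	return (0);
}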
Dmitry Chagin
32c01de21c Implement a new way of branding ELF binaries by looking at the
".note.ABI-tag" section.

The brand search order is changed: the ".note.ABI-tag" section is now
examined first.

Move the code that fetches osreldate for an ELF binary into the
check_note() handler.

PR:		118473
Approved by:	kib (mentor)
2009-03-13 16:40:51 +00:00
Warner Losh
5c0691d91a Make it clearer that the loop isn't a mistake... 2009-03-03 19:22:24 +00:00
Warner Losh
7008762a18 It appears that none of the contents of this file are necessary, so
replace the amd64-ish version with a blank version.
2009-02-15 20:05:13 +00:00
Warner Losh
77bab78951 Remove stray __P() 2009-02-15 01:12:16 +00:00
Warner Losh
f3e39d2a7b Rewrite get_pv_entry() to match expectations of the rest of the
kernel.  Rather than just kick off the page daemon, we actively retire
more mappings.  The inner loop now looks a lot like the inner loop of
pmap_remove_all.

Also, get_pv_entry() can't return NULL now, so remove the panic for that case.

Reviewed by:	alc@
2009-02-12 01:14:49 +00:00
Warner Losh
6bf6f10d24 pmap_kernel() was recently deleted from pmap.h. It was still used
here, but inconsistently.  Change all instances of it to kernel_pmap
to match the rest of the code.
2009-02-12 01:10:53 +00:00
Alan Cox
11bd9105b6 Eliminate an unused definition. 2009-02-10 06:08:28 +00:00
Oleksandr Tymoshenko
5defb9db6b - Fix in_cksum for big-endian MIPS: use correct compile-time check. 2009-02-08 23:43:36 +00:00
Warner Losh
3beb4f7ed3 Retire NO_DMA completely. 2009-02-08 08:13:36 +00:00
Warner Losh
40a70e188b Eliminate the PMAP_INLINE macro. It isn't really used here. If we
need to bring it back, we can.
2009-01-16 08:38:03 +00:00
Warner Losh
bf969a35eb Remove unused variable.
Minor style nits.
2009-01-16 08:30:22 +00:00
Oleksandr Tymoshenko
f624d2026a - pmap_track_modified was retired in r178606. Reintroducing it was a mistake.
Spotted by: alc@
2009-01-15 23:03:27 +00:00
Warner Losh
1435181496 Reduce diffs to p4 that were the result of a mismerge on my part. 2009-01-15 19:57:45 +00:00
Oleksandr Tymoshenko
16a296f7c9 MFp4:
- Add debug output
- Fix pmap_zero_page and related places: use uncached segments and invalidate
    cache after zeroing memory.
 - Do not test for the modified bit if it's not necessary
    (merged from mips-juniper p4 branch)
- Some #includes reorganization
2009-01-15 18:31:36 +00:00
Warner Losh
9703f72989 MFp4:
Remove Maxmem.  It isn't used elsewhere in the system at this point...
realmem is used instead.
2009-01-15 08:01:50 +00:00
Warner Losh
85c209c24d Call platform_reset() instead of looping forever on reboot.
# We likely need to have a default one of these that jumps to the ROM boot
# address that's defined in the MIPS ISA.
2009-01-15 07:51:17 +00:00
Warner Losh
b1b26fc820 Reverse order of dumpsys and cpu_idle_wakeup to reduce diffs to p4. 2009-01-15 07:48:37 +00:00
Warner Losh
f180851318 MFp4:
Remove #if'd 0 code.  It is interfering with other diffs.
2009-01-15 07:45:40 +00:00
Oleksandr Tymoshenko
36dc08476d o Code cleanup, remove unused fields of idtpci_softc 2009-01-14 22:46:13 +00:00