Commit Graph

12617 Commits

Author SHA1 Message Date
John Baldwin
7d313e7bdb Add COMPAT_FREEBSD9 and COMPAT_FREEBSD10 options to wrap code that
provides compatability for FreeBSD 9.x and 10.x binaries.  Enable
these options in kernel configs that enable other COMPAT_FREEBSD<n>
options.
2014-10-24 19:58:24 +00:00
Alan Cox
9f10ddb8f6 Under PAE ULONG is insufficient for representing the physical memory size. Use QUAD
for "hw.physmem" so that a physical memory size greater than 4 GB can be specified
from the boot loader.
2014-10-23 21:02:40 +00:00
Roger Pau Monné
bf7313e3b7 xen: implement the privcmd user-space device
This device is only attached to priviledged domains, and allows the
toolstack to interact with Xen. The two functions of the privcmd
interface is to allow the execution of hypercalls from user-space, and
the mapping of foreign domain memory.

Sponsored by: Citrix Systems R&D

i386/include/xen/hypercall.h:
amd64/include/xen/hypercall.h:
 - Introduce a function to make generic hypercalls into Xen.

xen/interface/xen.h:
xen/interface/memory.h:
 - Import the new hypercall XENMEM_add_to_physmap_range used by
   auto-translated guests to map memory from foreign domains.

dev/xen/privcmd/privcmd.c:
 - This device has the following functions:
   - Allow user-space applications to make hypercalls into Xen.
   - Allow user-space applications to map memory from foreign domains,
     this is accomplished using the newly introduced hypercall
     (XENMEM_add_to_physmap_range).

xen/privcmd.h:
 - Public ioctl interface for the privcmd device.

x86/xen/hvm.c:
 - Remove declaration of hypercall_page, now it's declared in
   hypercall.h.

conf/files:
 - Add the privcmd device to the build process.
2014-10-22 17:07:20 +00:00
Mateusz Guzik
07b384cbe2 Eliminate unnecessary memory allocation in sys_getgroups and its ibcs2 counterpart. 2014-10-21 23:08:46 +00:00
Davide Italiano
2be111bf7d Follow up to r225617. In order to maximize the re-usability of kernel code
in userland rename in-kernel getenv()/setenv() to kern_setenv()/kern_getenv().
This fixes a namespace collision with libc symbols.

Submitted by:   kmacy
Tested by:      make universe
2014-10-16 18:04:43 +00:00
Konstantin Belousov
36f6ca4012 MFi386 r272761.
Noted by:	Holger Hans Peter Freyther <holger at freyther.de>
Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
2014-10-11 16:17:49 +00:00
Mark Johnston
5eaae1411f Pass up the error status of minidumpsys() to its callers.
PR:		193761
Submitted by:	Conrad Meyer <conrad.meyer@isilon.com>
Sponsored by:	EMC / Isilon Storage Division
2014-10-08 20:25:21 +00:00
Konstantin Belousov
07a92f34d6 Add an argument to the x86 pmap_invalidate_cache_range() to request
forced invalidation of the cache range regardless of the presence of
self-snoop feature.  Some recent Intel GPUs in some modes are not
coherent, and dirty lines in CPU cache must be flushed before the
pages are transferred to GPU domain.

Reviewed by:	alc (previous version)
Tested by:	pho (amd64)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-10-08 16:48:03 +00:00
John Baldwin
4a872f3d4f Call initializecpucache() on the BSP for i386 in the !XEN case. This was
my bug in r271409 that I noticed while reviewing r272492.
2014-10-06 15:43:57 +00:00
Yoshihiro Takahashi
9407825cc6 Merge pc98's machdep.c into i386/i386/machdep.c. 2014-10-04 06:01:30 +00:00
Roger Pau Monné
44e06d158a msi: add Xen MSI implementation
This patch adds support for MSI interrupts when running on Xen. Apart
from adding the Xen related code needed in order to register MSI
interrupts this patch also makes the msi_init function a hook in
init_ops, so different MSI implementations can have different
initialization functions.

Sponsored by: Citrix Systems R&D

xen/interface/physdev.h:
 - Add the MAP_PIRQ_TYPE_MULTI_MSI to map multi-vector MSI to the Xen
   public interface.

x86/include/init.h:
 - Add a hook for setting custom msi_init methods.

amd64/amd64/machdep.c:
i386/i386/machdep.c:
 - Set the default msi_init hook to point to the native MSI
   initialization method.

x86/xen/pv.c:
 - Set the Xen MSI init hook when running as a Xen guest.

x86/x86/local_apic.c:
 - Call the msi_init hook instead of directly calling msi_init.

xen/xen_intr.h:
x86/xen/xen_intr.c:
 - Introduce support for registering/releasing MSI interrupts with
   Xen.
 - The MSI interrupts will use the same PIC as the IO APIC interrupts.

xen/xen_msi.h:
x86/xen/xen_msi.c:
 - Introduce a Xen MSI implementation.

x86/xen/xen_nexus.c:
 - Overwrite the default MSI hooks in the Xen Nexus to use the Xen MSI
   implementation.

x86/xen/xen_pci.c:
 - Introduce a Xen specific PCI bus that inherits from the ACPI PCI
   bus and overwrites the native MSI methods.
 - This is needed because when running under Xen the MSI messages used
   to configure MSI interrupts on PCI devices are written by Xen
   itself.

dev/acpica/acpi_pci.c:
 - Lower the quality of the ACPI PCI bus so the newly introduced Xen
   PCI bus can take over when needed.

conf/files.i386:
conf/files.amd64:
 - Add the newly created files to the build process.
2014-09-30 16:46:45 +00:00
Roger Pau Monné
c98a2727cc ddb: allow specifying the exact address of the symtab and strtab
When the FreeBSD kernel is loaded from Xen the symtab and strtab are
not loaded the same way as the native boot loader. This patch adds
three new global variables to ddb that can be used to specify the
exact position and size of those tables, so they can be directly used
as parameters to db_add_symbol_table. A new helper is introduced, so callers
that used to set ksym_start and ksym_end can use this helper to set the new
variables.

It also adds support for loading them from the Xen PVH port, that was
previously missing those tables.

Sponsored by: Citrix Systems R&D
Reviewed by:	kib

ddb/db_main.c:
 - Add three new global variables: ksymtab, kstrtab, ksymtab_size that
   can be used to specify the position and size of the symtab and
   strtab.
 - Use those new variables in db_init in order to call db_add_symbol_table.
 - Move the logic in db_init to db_fetch_symtab in order to set ksymtab,
   kstrtab, ksymtab_size from ksym_start and ksym_end.

ddb/ddb.h:
 - Add prototype for db_fetch_ksymtab.
 - Declate the extern variables ksymtab, kstrtab and ksymtab_size.

x86/xen/pv.c:
 - Add support for finding the symtab and strtab when booted as a Xen
   PVH guest. Since Xen loads the symtab and strtab as NetBSD expects
   to find them we have to adapt and use the same method.

amd64/amd64/machdep.c:
arm/arm/machdep.c:
i386/i386/machdep.c:
mips/mips/machdep.c:
pc98/pc98/machdep.c:
powerpc/aim/machdep.c:
powerpc/booke/machdep.c:
sparc64/sparc64/machdep.c:
 - Use the newly introduced db_fetch_ksymtab in order to set ksymtab,
   kstrtab and ksymtab_size.
2014-09-25 08:28:10 +00:00
Bjoern A. Zeeb
581980cff8 Re-gen after r271743 implementing most of
timer_{create,settime,gettime,getoverrun,delete}.

MFC after:		3 days
Sponsored by:		DARPA, AFRL
2014-09-18 08:40:00 +00:00
Bjoern A. Zeeb
0a041f3b47 Implement most of timer_{create,settime,gettime,getoverrun,delete}
for amd64/linux32.  Fix the entirely bogus (untested) version from
r161310 for i386/linux using the same shared code in compat/linux.

It is unclear to me if we could support more clock mappings but
the current set allows me to successfully run commercial
32bit linux software under linuxolator on amd64.

Reviewed by:		jhb
Differential Revision:	D784
MFC after:		3 days
Sponsored by:		DARPA, AFRL
2014-09-18 08:36:45 +00:00
Konstantin Belousov
490356e5b7 Presence of any VM_PROT bits in the permission argument on x86 implies
that the entry is readable and valid.

Reported by:	markj
Submitted by:	alc
Tested by:	pho (previous version), markj
MFC after:	3 days
2014-09-17 18:49:57 +00:00
John Baldwin
de2b02fc74 MFamd64: Use initializecpu() to set various model-specific registers on
AP startup and AP resume (it was already used for BSP startup and BSP
resume).
- Split code to do one-time probing of cache properties out of
  initializecpu() and into initializecpucache().  This is called once on
  the BSP during boot.
- Move enable_sse() into initializecpu().
- Call initializecpu() for AP startup instead of enable_sse() and
  manually frobbing MSR_EFER to enable PG_NX.
- Call initializecpu() when an AP resumes.  In theory this will now
  properly re-enable PG_NX in MSR_EFER when resuming a PAE kernel on
  APs.
2014-09-10 21:37:47 +00:00
John Baldwin
645b112b68 To workaround an errata on certain Pentium Pro CPUs, i386 disables
the local APIC in initializecpu() and re-enables it if the APIC code
decides to use the local APIC after all.  Rework this workaround
slightly so that initializecpu() won't re-disable the local APIC if
it is called after the APIC code re-enables the local APIC.
2014-09-10 21:25:54 +00:00
John Baldwin
eec906cf32 Move code to set various MSRs on AMD cpus out of printcpuinfo() and
into initalizecpu() instead.
2014-09-10 21:04:44 +00:00
John Baldwin
b1d735ba4c Create a separate structure for per-CPU state saved across suspend and
resume that is a superset of a pcb.  Move the FPU state out of the pcb and
into this new structure.  As part of this, move the FPU resume code on
amd64 into a C function.  This allows resumectx() to still operate only on
a pcb and more closely mirrors the i386 code.

Reviewed by:	kib (earlier version)
2014-09-06 15:23:28 +00:00
Pedro F. Giffuni
11db54f172 Apply known workarounds for modern MacBooks.
The legacy USB circuit tends to give trouble on MacBook.
While the original report covered MacBook, extend the fix
preemptively for the newer MacBookPro too.

PR:		191693
Reviewed by:	emaste
MFC after:	5 days
2014-09-05 01:06:45 +00:00
Mark Johnston
a58b4afa9f Add mrsas(4) to GENERIC for i386 and amd64.
Approved by:	ambrisko, kadesai
MFC after:	3 days
2014-09-04 21:06:33 +00:00
John Baldwin
33a50f1b0f Merge the amd64 and i386 identcpu.c into a single x86 implementation.
This brings the structured extended features mask and VT-x reporting to
i386 and Intel cache and TLB info (under bootverbose) to amd64.
2014-09-04 14:26:25 +00:00
John Baldwin
2352bbe4cd Remove a stray blank line from the Intel cache and TLB info. 2014-09-04 02:28:17 +00:00
John Baldwin
fe760cfa1a - Move blacklists of broken TSCs out of the printcpuinfo() function
and into the TSC probe routine.
- Initialize cpu_exthigh once in finishidentcpu() which is called
  before printcpuinfo() (and matches the behavior on amd64).
2014-09-04 02:25:59 +00:00
John Baldwin
2b793beefd Remove trailing whitespace. 2014-09-04 01:56:15 +00:00
John Baldwin
7fb40488d6 - Move prototypes for various functions into out of C files and into
<machine/md_var.h>.
- Move some CPU-related variables out of i386/i386/identcpu.c to
  initcpu.c to match amd64.
- Move the declaration of has_f00f_hack out of identcpu.c to machdep.c.
- Remove a misleading comment from i386/i386/initcpu.c (locore zeros
  the BSS before it calls identify_cpu()) and remove explicit zero
  assignments to reduce the diff with amd64.
2014-09-04 01:46:06 +00:00
John Baldwin
6f00c0f90f Actually save and restore FPU state on APs during suspend and resume.
Committed from:	Atom-based HP netbook after resuming in X
2014-09-03 21:17:09 +00:00
John Baldwin
0f68c1833b Save and restore FPU state across suspend and resume. In earlier revisions
of this patch, resumectx() called npxresume() directly, but that doesn't
work because resumectx() runs with a non-standard %cs selector.  Instead,
all of the FPU suspend/resume handling is done in C.

MFC after:	1 week
2014-08-30 17:48:38 +00:00
Pedro F. Giffuni
ec4a0b4408 Minor space/tab cleanups.
Most of them were ripped from the GSoC 2104
SMAP + kpatch project.
This is only a cosmetic change.

Taken from:	Oliver Pinter (op@)
MFC after:	5 days
2014-08-30 15:41:07 +00:00
John Baldwin
89871cdeb6 - Add a new structure type for the ACPI 3.0 SMAP entry that includes the
optional attributes field.
- Add a 'machdep.smap' sysctl that exports the SMAP table of the running
  system as an array of the ACPI 3.0 structure.  (On older systems, the
  attributes are given a value of zero.)  Note that the sysctl only
  exports the SMAP table if it is available via the metadata passed from
  the loader to the kernel.  If an SMAP is not available, an empty array
  is returned.
- Add a format handler for the ACPI 3.0 SMAP structure to the sysctl(8)
  binary to format the SMAP structures in a readable format similar to
  the format found in boot messages.

MFC after:	2 weeks
2014-08-29 21:25:47 +00:00
John Baldwin
4c7785afaf MFamd64: Add a machdep.bootmethod sysctl to inform the installer which
firmware method was used for booting.  This is hardcoded to BIOS on i386.

PR:		192962
Reviewed by:	nwhitehorn
MFC after:	1 week
2014-08-29 21:08:40 +00:00
John Baldwin
669eac89c5 Fix build of si(4) and enable it in LINT on amd64 and i386. 2014-08-20 16:07:17 +00:00
Konstantin Belousov
5e1d15a879 Complete r254667, do not destroy pmap lock if KVA allocation failed.
Submitted by:	Svatopluk Kraus <onwahe@gmail.com>
MFC after:	1 week
2014-08-16 08:31:25 +00:00
Gavin Atkinson
67e3b91b31 Update i386/NOTES and amd64/NOTES files to contain the complete list of
firmwares for iwn(4) and sort them.

MFC after:	1 week
2014-08-14 18:29:55 +00:00
John Baldwin
d9f3b5e7b4 Correct a comment brought over from amd64. i386 doesn't use long
mode.
2014-08-12 18:22:57 +00:00
Alan Cox
827a661da0 Change {_,}pmap_allocpte() so that they look for the flag PMAP_ENTER_NOSLEEP
instead of M_NOWAIT/M_WAITOK when deciding whether to sleep on page table
page allocation.  (The same functions in the i386/xen and mips pmap
implementations already use PMAP_ENTER_NOSLEEP.)

X-MFC with:	r269728
Sponsored by:	EMC / Isilon Storage Division
2014-08-11 17:45:41 +00:00
Konstantin Belousov
39ffa8c138 Change pmap_enter(9) interface to take flags parameter and superpage
mapping size (currently unused).  The flags includes the fault access
bits, wired flag as PMAP_ENTER_WIRED, and a new flag
PMAP_ENTER_NOSLEEP to indicate that pmap should not sleep.

For powerpc aim both 32 and 64 bit, fix implementation to ensure that
the requested mapping is created when PMAP_ENTER_NOSLEEP is not
specified, in particular, wait for the available memory required to
proceed.

In collaboration with:	alc
Tested by:	nwhitehorn (ppc aim32 and booke)
Sponsored by:	The FreeBSD Foundation and EMC / Isilon Storage Division
MFC after:	2 weeks
2014-08-08 17:12:03 +00:00
Gleb Smirnoff
c8d2ffd6a7 Merge all MD sf_buf allocators into one MI, residing in kern/subr_sfbuf.c
The MD allocators were very common, however there were some minor
differencies. These differencies were all consolidated in the MI allocator,
under ifdefs. The defines from machine/vmparam.h turn on features required
for a particular machine. For details look in the comment in sys/sf_buf.h.

As result no MD code left in sys/*/*/vm_machdep.c. Some arches still have
machine/sf_buf.h, which is usually quite small.

Tested by:	glebius (i386), tuexen (arm32), kevlo (arm32)
Reviewed by:	kib
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-08-05 09:44:10 +00:00
Alan Cox
a695d9b25b Retire pmap_change_wiring(). We have never used it to wire virtual pages.
We continue to use pmap_enter() for that.  For unwiring virtual pages, we
now use pmap_unwire(), which unwires a range of virtual addresses instead
of a single virtual page.

Sponsored by:	EMC / Isilon Storage Division
2014-08-03 20:40:51 +00:00
Marius Strobl
1c42633e90 - Copying and zeroing pages via temporary mappings involves updating the
corresponding page tables followed by accesses to the pages in question.
  This sequence is subject to the situation exactly described in the "AMD64
  Architecture Programmer's Manual Volume 2: System Programming" rev. 3.23,
  "7.3.1 Special Coherency Considerations" [1, p. 171 f.]. Therefore, issuing
  the INVLPG right after modifying the PTE bits is crucial.
  For pmap_copy_page(), this has been broken in r124956 and later on carried
  over to pmap_copy_pages() derived from the former, while all other places
  in the i386 PMAP code use the correct order of instructions in this regard.
  Fixing the latter breakage solves the problem of data corruption seen with
  unmapped I/O enabled when running at least bare metal on AMD R-268D APUs.
  However, this might also fix similar corruption reported for virtualized
  environments.
- In pmap_copy_pages(), correctly set the cache bits on the source page being
  copied. This change is thought to be a NOP for the real world, though. [2]

1: http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/24593_APM_v21.pdf

Submitted by:	kib [2]
Reviewed by:	alc, kib
MFC after:	3 days
Sponsored by:	Bally Wulff Games & Entertainment GmbH
2014-07-24 10:08:02 +00:00
Mark Johnston
291624fdf6 Invoke the DTrace trap handler before calling trap() on amd64. This matches
the upstream implementation and helps ensure that a trap induced by tracing
fbt::trap:entry is handled without recursively generating another trap.

This makes it possible to run most (but not all) of the DTrace tests under
common/safety/ without triggering a kernel panic.

Submitted by:	Anton Rang <anton.rang@isilon.com> (original version)
Phabric:	D95
2014-07-14 04:38:17 +00:00
Konstantin Belousov
a5244bac2e Correct si_code for the SIGBUS signal generated by the alignment trap.
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-08 08:05:42 +00:00
Alan Cox
09132ba6ac Introduce pmap_unwire(). It will replace pmap_change_wiring(). There are
several reasons for this change:

pmap_change_wiring() has never (in my memory) been used to set the wired
attribute on a virtual page.  We have always used pmap_enter() to do that.
Moreover, it is not really safe to use pmap_change_wiring() to set the wired
attribute on a virtual page.  The description of pmap_change_wiring() says
that it assumes the existence of a mapping in the pmap.  However, non-wired
mappings may be reclaimed by the pmap at any time.  (See pmap_collect().)
Many implementations of pmap_change_wiring() will crash if the mapping does
not exist.

pmap_unwire() accepts a range of virtual addresses, whereas
pmap_change_wiring() acts upon a single virtual page.  Since we are
typically unwiring a range of virtual addresses, pmap_unwire() will be more
efficient.  Moreover, pmap_unwire() allows us to unwire superpage mappings.
Previously, we were forced to demote the superpage mapping, because
pmap_change_wiring() only allowed us to express the unwiring of a single
base page mapping at a time.  This added to the overhead of unwiring for
large ranges of addresses, including the implicit unwiring that occurs at
process termination.

Implementations for arm and powerpc will follow.

Discussed with:	jeff, marcel
Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
2014-07-06 17:42:38 +00:00
Ed Maste
ccbb7b5e19 Add vt(4) devices and options to NOTES
Reviewed by:	marius (earlier version)
2014-07-01 00:22:54 +00:00
Ed Maste
30dbb3eabc Add vt(4) to GENERIC and retire the separate VT config
vt(4) and sc(4) can now coexist in the same kernel.  To choose the vt
driver, set the loader tunable kern.vty=vt .
2014-06-30 16:18:38 +00:00
Hans Petter Selasky
af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
Glen Barber
37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky
3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Konstantin Belousov
633034fe0e Add FPU_KERN_KTHR flag to fpu_kern_enter(9), which avoids saving FPU
context into memory for the kernel threads which called
fpu_kern_thread(9).  This allows the fpu_kern_enter() callers to not
check for is_fpu_kern_thread() to get the optimization.

Apply the flag to padlock(4) and aesni(4).  In aesni_cipher_process(),
do not leak FPU context state on error.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-06-23 07:37:54 +00:00
Dmitry Chagin
2dedc1281a Revert r266925 as it can lead to instant panic at fexecve():
To allow to run the interpreter itself add a new ELF branding type.

Pointed out by:	kib, mjg
2014-06-17 05:29:18 +00:00