Commit Graph

12801 Commits

Author SHA1 Message Date
Alan Cox
9f86aba61c Exploit r288122 to address a cosmetic issue. Since PV chunk pages don't
belong to a vm object, they can't be paged out.  Since they can't be paged
out, they are never enqueued in a paging queue.  Nonetheless, passing
PQ_INACTIVE to vm_page_unwire() creates the appearance that these pages
are being enqueued in the inactive queue.  As of r288122, we can avoid
this false impression by passing PQ_NONE.

Submitted by:	kmacy (an earlier version)
Differential Revision:	https://reviews.freebsd.org/D1674
2015-09-26 07:18:05 +00:00
Konstantin Belousov
cff8c6f2d1 Add support for weak symbols to the kernel linkers. It means that
linkers no longer raise an error when undefined weak symbols are
found, but relocate as if the symbol value was 0.  Note that we do not
repeat the mistake of userspace dynamic linker of making the symbol
lookup prefer non-weak symbol definition over the weak one, if both
are available.  In fact, kernel linker uses the first definition
found, and ignores duplicates.

Signature of the elf_lookup() and elf_obj_lookup() functions changed
to split result/error code and the symbol address returned.
Otherwise, it is impossible to return zero address as the symbol
value, to MD relocation code.  This explains the mechanical changes in
elf_machdep.c sources.

The powerpc64 R_PPC_JMP_SLOT handler did not checked error from the
lookup() call, the patch leaves the code as is (untested).

Reported by:	glebius
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-09-20 01:27:59 +00:00
Mark Johnston
610141cebb Add stack_save_td_running(), a function to trace the kernel stack of a
running thread.

It is currently implemented only on amd64 and i386; on these
architectures, it is implemented by raising an NMI on the CPU on which
the target thread is currently running. Unlike stack_save_td(), it may
fail, for example if the thread is running in user mode.

This change also modifies the kern.proc.kstack sysctl to use this function,
so that stacks of running threads are shown in the output of "procstat -kk".
This is handy for debugging threads that are stuck in a busy loop.

Reviewed by:	bdrewery, jhb, kib
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3256
2015-09-11 03:54:37 +00:00
Mark Johnston
4db79feb8f Merge stack(9) implementations for i386 and amd64 under x86/.
Reviewed by:	jhb, kib
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3255
2015-09-11 03:24:07 +00:00
Konstantin Belousov
1fa6712471 Do not hold the process around the vm_fault() call from the trap()s.
The only operation which is prevented by the hold is the kernel stack
swapout for the faulted thread, which should be fine to allow.

Remove useless checks for NULL curproc or curproc->p_vmspace from the
trap_pfault() wrappers on x86 and powerpc.

Reviewed by:	alc (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-09-10 17:46:48 +00:00
Roger Pau Monné
e8234cfef6 preload_search_info: make sure mod is set
Add a check to preload_search_info to make sure mod is set. Most of the
callers of preload_search_info don't check that the mod parameter is
set, which can cause page faults. While at it, remove some now unnecessary
checks before calling preload_search_info.

Sponsored by:		Citrix Systems R&D
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D3440
2015-08-21 15:57:57 +00:00
Marcel Moolenaar
7ef5e8bc80 Better support memory mapped console devices, such as VGA and EFI
frame buffers and memory mapped UARTs.

1.  Delay calling cninit() until after pmap_bootstrap(). This makes
    sure we have PMAP initialized enough to add translations. Keep
    kdb_init() after cninit() so that we have console when we need
    to break into the debugger on boot.
2.  Unfortunately, the ATPIC code had be moved as well so as to
    avoid a spurious trap #30. The reason for which is not known
    at this time.
3.  In pmap_mapdev_attr(), when we need to map a device prior to the
    VM system being initialized, use virtual_avail as the KVA to map
    the device at. In particular, avoid using the direct map on amd64
    because we can't demote by virtue of not being able to allocate
    yet. Keep track of the translation.
    Re-use the translation after the VM has been initialized to not
    waste KVA and to satisfy the assumption in uart(4) that the handle
    returned for the low-level console is the same as later returned
    when the device is probed and attached.
4.  In pmap_unmapdev() remove the mapping from the table when called
    pre-init. Otherwise keep the mapping. During bus probe and attach
    device resources are mapped and unmapped multiple times, which
    would have us destroy the mapping used by the low-level console.
5.  In pmap_init(), set pmap_initialized to signal that we're not
    pre-init anymore. On amd64, bring the direct map in sync with the
    translations created at that time.
6.  Implement bus_space_map() and bus_space_unmap() for real: when
    the tag corresponds to memory space, call the corresponding
    pmap_mapdev() and pmap_unmapdev() functions to construct and
    actual handle.
7.  In efifb.c and vt_vga.c, remove the crutches and hacks and simply
    call pmap_mapdev_attr() or bus_space_map() as desired.

Notes:
1.  uart(4) already used bus_space_map() during low-level console
    setup but since serial ports have traditionally been I/O port
    based, the lack of a proper implementation for said function
    was not a problem. It has always supported memory mapped UARTs
    for low-level consoles by setting hw.uart.console accordingly.
2.  The use of the direct map on amd64 without setting caching
    attributes has been a bigger problem than previously thought.
    This change has the fortunate (and unexpected) side-effect of
    fixing various EFI frame buffer problems (though not all).

PR: 191564, 194952

Special thanks to:
1.  XipLink, Inc -- generously donated an Intel Bay Trail E3800
    based eval board (ADLE3800PC).
2.  The FreeBSD Foundation, in particular emaste@ -- for UEFI
    support in general and testing.
3.  Everyone who tested the proposed for PR 191564.
4.  jhb@ and kib@ for being a soundboard and applying a clue bat
    if so needed.
2015-08-12 15:26:32 +00:00
Konstantin Belousov
0e190a486f Initialization of smp_tlb_wait does not require release semantic, no
data is synchronized by store/load to the variable.  The
lapic_write_icr() function ensures that store buffers are flushed
before IPI command is issued.

Discussed with:	bde
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-08-12 09:46:39 +00:00
Konstantin Belousov
c77d57c8b4 AP should load aps_ready with acquire semantic to see BSP updates to
the SMP structures, synchronized with the load by release store in
release_aps().

The change is formal, x86 strong memory model implicitely provided
the guarantees.

Discussed with:	bde
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-08-12 09:43:12 +00:00
Konstantin Belousov
edc8222303 Make kstack_pages a tunable on arm, x86, and powepc. On i386, the
initial thread stack is not adjusted by the tunable, the stack is
allocated too early to get access to the kernel environment. See
TD0_KSTACK_PAGES for the thread0 stack sizing on i386.

The tunable was tested on x86 only.  From the visual inspection, it
seems that it might work on arm and powerpc.  The arm
USPACE_SVC_STACK_TOP and powerpc USPACE macros seems to be already
incorrect for the threads with non-default kstack size.  I only
changed the macros to use variable instead of constant, since I cannot
test.

On arm64, mips and sparc64, some static data structures are sized by
KSTACK_PAGES, so the tunable is disabled.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 week
2015-08-10 17:18:21 +00:00
Konstantin Belousov
c3f62b5352 Remove unused i386 header privatespace.h. For the native kernel, its
use was removed in r173592 (Nov 2007), yet Xen PV bits continued
referencing the privatespace structure, and were removed in r282274
(Apr 2015).

Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
2015-08-07 05:59:58 +00:00
John Baldwin
3c790178c5 Remove some more vestiges of the Xen PV domu support. Specifically,
use vtophys() directly instead of vtomach() and retire the no-longer-used
headers <machine/xenfunc.h> and <machine/xenvar.h>.

Reported by:	bde (stale bits in <machine/xenfunc.h>)
Reviewed by:	royger (earlier version)
Differential Revision:	https://reviews.freebsd.org/D3266
2015-08-06 17:07:21 +00:00
Ed Maste
fc8c856029 Rationalize BSD license on sys/*/include/in_cksum.h
Remove the advertising clause from the Regents of the University of
California's license, per the letter dated July 22, 1999.

Update clause numbering.
2015-08-05 19:05:12 +00:00
Konstantin Belousov
c8fbdcc10d Fix UP build after r286296, ensure that CPU_FOREACH() is defined.
Sponsored by:	The FreeBSD Foundation
2015-08-05 10:50:33 +00:00
Jason A. Harmening
713841afb2 Add two new pmap functions:
vm_offset_t pmap_quick_enter_page(vm_page_t m)
void pmap_quick_remove_page(vm_offset_t kva)

These will create and destroy a temporary, CPU-local KVA mapping of a specified page.

Guarantees:
--Will not sleep and will not fail.
--Safe to call under a non-sleepable lock or from an ithread

Restrictions:
--Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms
--Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page.
--MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page

The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm.

NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing.

Reviewed by:	kib
Approved by:	kib (mentor)
Differential Revision:	http://reviews.freebsd.org/D3013
2015-08-04 19:46:13 +00:00
Konstantin Belousov
3208d3ff46 Give large kernel stack to the initial thread . Otherwise, ZFS
overflows the stack during root mount in some configurations.

Tested by:	Fabian Keil <freebsd-listen@fabiankeil.de> (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-08-04 13:50:52 +00:00
Warner Losh
75333e6435 Add pmspvc device back to GENERIC. The issues with the device playing
grabby hands with other driver's devices has been solved.

MFC After: 3 weeks
2015-08-03 13:49:46 +00:00
Konstantin Belousov
f94cc23475 Clear the IA32_MISC_ENABLE MSR bit, which limits the max CPUID
reported, on APs.  We already did this on BSP.

Otherwise, the userspace software which depends on the features
reported by the high CPUID levels is misbehaving.  In particular, AVX
detection is non-functional, depending on which CPU thread happens to
execute when doing CPUID.  Another victim is the libthr signal
handlers interposer, which needs to save full FPU extended state.

Reported and tested by:	Andre Meiser <ortadur@web.de>
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-08-03 12:14:42 +00:00
Glen Barber
45e1c1a38d Pull pmspcv (pms(4)) from GENERIC. It has PCI ID conflicts
with ahd(4), mvs(4), and likely other drivers.

MFC after:	immediately
With hat:	re
Sponsored by:	The FreeBSD Foundation
2015-07-31 15:23:48 +00:00
Konstantin Belousov
0b6476ec5b Improve comments.
Submitted by:	bde
MFC after:	2 weeks
2015-07-30 15:47:53 +00:00
Konstantin Belousov
48cae112b5 Use private cache line for the locked nop in *mb() on i386.
Suggested by:	alc
Reviewed by:	alc, bde
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-07-30 00:13:20 +00:00
Konstantin Belousov
dd5b64258f MFamd64 r285934: Remove store/load (= full) barrier from the i386
atomic_load_acq_*().

Noted by:	alc (long time ago)
Reviewed by:	alc, bde
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-07-29 23:59:17 +00:00
Sean Bruno
79855a57e2 Remove dead functions pmap_pvdump and pads.
Differential Revision:	D3206
Submitted by:	kevin.bowling@kev009.com
Reviewed by:	alc
2015-07-29 20:47:27 +00:00
Konstantin Belousov
c48c590f63 Remove duplicate and useless declarations.
Submitted by:	bde
2015-07-22 09:12:40 +00:00
John Baldwin
9a2d6ab990 Various changes to the registers displayed in DDB for x86.
- Fix segment registers to only display the low 16 bits.
- Remove unused handlers and entries for the debug registers.
- Display xcr0 (if valid) in 'show sysregs'.
- Add '0x' prefix to MSR values to match other values in 'show sysregs'.
- MFamd64: Display various MSRs in 'show sysregs'.
- Add a 'show dbregs' to display the value of debug registers.
- Dynamically size the column width for register values to properly
  align columns on 64-bit platforms.
- Display %gs for i386 in 'show registers'.

Differential Revision:	https://reviews.freebsd.org/D2784
Reviewed by:	kib, markj
MFC after:	2 weeks
2015-07-22 01:09:02 +00:00
Mark Johnston
a5cbf8b9c0 Let the unwinder handle faults during function prologues or epilogues.
The i386 and amd64 DDB stack unwinders contain code to detect and handle
the case where the first frame is not completely set up or torn down. This
code was accidentally unused however, since db_backtrace() was never called
with a non-NULL trap frame. This change fixes that.

Also remove get_rsp() from the amd64 code. It appears to have come from
i386, which needs to take into account whether the exception triggered a
CPL switch, since SS:ESP is only pushed onto the stack if so. On amd64,
SS:RSP is pushed regardless, so get_rsp() was doing the wrong thing for
kernel-mode exceptions. As a result, we can also remove custom print
functions for these registers.

Reviewed by:	jhb
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D2881
2015-07-21 23:22:23 +00:00
Mark Johnston
f8a757d016 Improve stack unwinding on i386 and amd64 after an IP fault.
If we can't find a symbol corresponding to the faulting instruction, assume
that the previously-executed function is a call and attempt to find the
calling function using the return address on the stack. Otherwise we end
up associating the last stack frame with the current call, which is
incorrect and causes the unwinder to skip printing of the calling function,
resulting in a confusing backtrace.

Reviewed by:	jhb
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D2859
2015-07-21 23:13:11 +00:00
Mark Johnston
32cd0147fa Implement the lockstat provider using SDT(9) instead of the custom provider
in lockstat.ko. This means that lockstat probes now have typed arguments and
will utilize SDT probe hot-patching support when it arrives.

Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D2993
2015-07-19 22:14:09 +00:00
Konstantin Belousov
a8e1bc2e14 Revert bit of the r285627, locore.s does not need include of
opt_kstack_pages.h.  The asm gets the right KSTACK_PAGES from the
assym.s.

Reported by:	bz
Sponsored by:	The FreeBSD Foundation
2015-07-19 10:45:58 +00:00
Benno Rice
eacbeb2b95 Merge driver for PMC Sierra's range of SAS/SATA HBAs.
Submitted by:	Achim Leubner <Achim.Leubner@pmcs.com>
Reviewed by:	scottl
2015-07-17 23:30:43 +00:00
Konstantin Belousov
888e282ab4 When checking for the valid value of the frame pointer, verify that it
belongs to the kernel stack address range for the thread.  Right now,
code checks that new frame is not farther then KSTACK_PAGES pages from
the current frame, which allows the address to point past the top of
the stack.

Reviewed by:	andrew, emaste, markj
Differential revision:	https://reviews.freebsd.org/D3108
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-07-16 19:40:18 +00:00
Zbigniew Bodek
721555e7ee Fix KSTACK_PAGES issue when the default value was changed in KERNCONF
If KSTACK_PAGES was changed to anything alse than the default,
the value from param.h was taken instead in some places and
the value from KENRCONF in some others. This resulted in
inconsistency which caused corruption in SMP envorinment.

Ensure all places where KSTACK_PAGES are used the opt_kstack_pages.h
is included.

The file opt_kstack_pages.h could not be included in param.h
because was breaking the toolchain compilation.

Reviewed by:   kib
Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3094
2015-07-16 10:46:52 +00:00
Christian Brueffer
f4c1eac7cd Spell crypto correctly. 2015-07-14 10:47:56 +00:00
Konstantin Belousov
85237e335d Convert between abridged (from FXSAVE) and unabridged (from FSAVE)
versions of the x87 tags.  The conversion is naive, used abridged tag
is converted to valid unabridged, without additional checks for zero
and special values.

Noted by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-07-10 09:20:13 +00:00
Konstantin Belousov
bcfc2be186 Duplicate the copyright from the i386/i386/machdep.c into
i386/include/frame.h after a code was moved from machdep.c to frame.h
in r284925.

Use include guards style similar to other guards.

Noted by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-07-10 09:15:06 +00:00
John-Mark Gurney
e808e13b8b Now that aesni won't reuse fpu contexts (D3016), add seatbelts to the
fpu code to prevent other reuse of the contexts in the future...

Differential Revision:        https://reviews.freebsd.org/D3015
Reviewed by:	kib, gnn
2015-07-08 19:26:36 +00:00
Konstantin Belousov
8954a9a4e6 Add the atomic_thread_fence() family of functions with intent to
provide a semantic defined by the C11 fences with corresponding
memory_order.

atomic_thread_fence_acq() gives r | r, w, where r and w are read and
write accesses, and | denotes the fence itself.

atomic_thread_fence_rel() is r, w | w.

atomic_thread_fence_acq_rel() is the combination of the acquire and
release in single operation.  Note that reads after the acq+rel fence
could be made visible before writes preceeding the fence.

atomic_thread_fence_seq_cst() orders all accesses before/after the
fence, and the fence itself is globally ordered against other
sequentially consistent atomic operations.

Reviewed by:	alc
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2015-07-08 18:12:24 +00:00
Achim Leubner
4e1bc9a039 Driver 'pmspcv' added. Supports PMC-Sierra PM8001/8081/8088/8089/8074/8076/8077 SAS/SATA HBA Controllers. 2015-07-07 13:17:02 +00:00
George V. Neville-Neil
0661a7c224 Fix up tabs vs. spaces 2015-07-04 20:31:06 +00:00
George V. Neville-Neil
3839369c03 Enable IPSEC in all GENERIC kernels.
Universe and kernel build tests passed 4 July 2015

PR:		128030
Sponsored by:	Rubicon Communications (Netgate)
2015-07-04 17:37:00 +00:00
Konstantin Belousov
6fdfd88220 Use single instance of the identical INKERNEL() and PMC_IN_KERNEL()
macros on amd64 and i386.  Move the definition to machine/param.h.
kgdb defines INKERNEL() too, the conflict is resolved by renaming kgdb
version to PINKERNEL().

On i386, correct the lowest kernel address.  After the shared page was
introduced, USRSTACK no longer points to the last user address + 1 [*]

Submitted by:	Oliver Pinter [*]
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-07-02 14:37:21 +00:00
Konstantin Belousov
b24de3ac40 Provide npx_get_fsave(9) and npx_set_fsave(9) functions to obtain and
restore the FPU state from the format of machine FSAVE area.  The
intended use is for ABI emulators to provide FSAVE-formatted FPU state
to usermode requiring it, while kernel could use FXSAVE due to
XMM/XSAVE.

The core functionality to convert from/to FXSAVE format is shared with
the fill_fpregs_xmm() and set_fpregs_xmm().  Move the later functions
to npx.c and rename them to npx_fill_fpregs_xmm() and
npx_set_fpregs_xmm().  They differ from nptx_get/set_fsave(9) since
our mcontext contains padding to be zeroed or ignored.

fill_fpregs() and set_fpregs() could be converted to use the new
interface, but there are small differences to handle.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-06-29 12:06:36 +00:00
Konstantin Belousov
f9343dacbd Move CS_SECURE() and EFL_SECURE() macros to the machine/frame.h. They
are useful for most implementations of sendsig().

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-06-29 10:35:00 +00:00
Konstantin Belousov
3ac3c0f269 Add a comment about too strong semantic of atomic_load_acq() on x86.
Submitted by:	bde
MFC after:	2 weeks
2015-06-29 09:58:40 +00:00
Konstantin Belousov
1817023775 Add x86 PT_GETFSBASE, PT_GETGSBASE machine-depended ptrace requests to
obtain the thread %fs and %gs bases.  Add x86 PT_SETFSBASE and
PT_SETGSBASE requests to set the bases from debuggers.  The set
requests, similarly to the sysarch({I386,AMD64}_SET_FSBASE),
override the corresponding segment registers.

The main purpose of the operations is to retrieve and modify the tcb
address for debuggee.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-06-29 07:07:24 +00:00
Konstantin Belousov
06d058bd23 Reduce code duplication. Add helper fill_based_sd(9) which creates a
based user data descriptor covering whole VA.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-06-29 06:59:08 +00:00
Konstantin Belousov
7626d062c3 Remove unneeded data dependency, currently imposed by
atomic_load_acq(9), on it source, for x86.

Right now, atomic_load_acq() on x86 is sequentially consistent with
other atomics, code ensures this by doing store/load barrier by
performing locked nop on the source.  Provide separate primitive
__storeload_barrier(), which is implemented as the locked nop done on
a cpu-private variable, and put __storeload_barrier() before load, to
keep seq_cst semantic but avoid introducing false dependency on the
no-modification of the source for its later use.

Note that seq_cst property of x86 atomic_load_acq() is not documented
and not carried by atomics implementations on other architectures,
although some kernel code relies on the behaviour.  This commit does
not intend to change this.

Reviewed by:	alc
Discussed with:	bde
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-06-28 05:04:08 +00:00
Mateusz Guzik
4da8456f0a Replace struct filedesc argument in getvnode with struct thread
This is is a step towards removal of spurious arguments.
2015-06-16 13:09:18 +00:00
John Baldwin
5eb95e11ba Report the values of x86 segment registers to remote debuggers.
While here, also report %eflags from the i386 trapframe.

Differential Revision:	https://reviews.freebsd.org/D2743
Reviewed by:	kib
Obtained from:	1 month
2015-06-12 15:14:08 +00:00
John Baldwin
9f9d9cf33e Ensure that the upper 16 bits of segment registers manually saved in
trapframes are cleared by explicitly pushing a zero and then moving
the segment register into the low 16 bits.  Certain Intel processors
treat a push of a segment register as a move of the segment register
into the low 16 bits leaving the upper 16 bits of the word in the
stack unchanged.

Reviewed by:	kib
MFC after:	1 month
2015-06-12 15:06:17 +00:00