12890 Commits

Author SHA1 Message Date
Alan Cox
53aadae680 As an optimization to the machine-independent layer, change the machine-
dependent pmap_ts_referenced() so that it updates the page's dirty field
if a modified bit is found while counting reference bits.  This
opportunistic update can be performed at low cost and can eliminate the
need for some future calls to pmap_is_modified() by the machine-
independent layer.

Reviewed by:	kib, markj
MFC after:	3 weeks
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D7722
2016-09-01 15:57:44 +00:00
Bruce Evans
ef209971e9 Shorten banal comments about zeroing and copying pages. Don't give
implementation details that last echoed the code 15-20 years ago.
But add a detail about pagezero() on i386.  Switch from Mach style
to BSD style.
2016-08-29 14:38:31 +00:00
Bruce Evans
1a5735873e On amd64, declare sse2_pagezero() and start using it again, but only
for zeroing pages in idle where nontemporal writes are clearly best.
This is almost a no-op since zeroing in idle does nothing good
and is off by default.  Fix END() statement forgotten in previous
commit.

Align the loop in sse2_pagezero().  Since it writes to main memory,
the loop doesn't have to be very carefully written to keep up.
Unrolling it was considered useless or harmful and was not done on
i386, but that was too careless.

Timing for i386: the loop was not unrolled at all, and moved only 4
bytes/iteration.  So on a 2GHz CPU, it needed to run at 2 cycles/
iteration to keep up with a memory speed of just 4GB/sec.  But when
it crossed a 16-byte boundary, on old CPUs it ran at 3 cycles/
iteration so it gave a maximum speed of 2.67GB/sec and couldn't even
keep up with PC3200 memory.  Fix the alignment so that it keeps up with
4GB/sec memory, and unroll once to get nearer to 8GB/sec.  Further
unrolling might be useless or harmful since it would prevent the loop
fitting in 16-bytes.  My test system with an old CPU and old DDR1 only
needed 5+ GB/sec.  My test system with a new CPU and DDR3 doesn't need
any changes to keep up at ~16GB/sec.

Timing for amd64: with 8-byte accesses and newer faster CPUs it is
easy to reach 16GB/sec but not so easy to go much faster.  The
alignment doesn't matter much if the CPU is not very old.  The loop
was already unrolled 4 times, but needs 32 bytes and uses a fancy
method that doesn't work for 2-way unrolling in 16 bytes.  Just
align it to 32-bytes.
2016-08-29 13:07:21 +00:00
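The i386 timing figures in the sse2_pagezero() message above follow from simple arithmetic: bandwidth is clock rate times bytes stored per iteration, divided by cycles per iteration. The sketch below only reproduces that arithmetic. The helper name `loop_mb_per_sec` is hypothetical, and the 8 bytes/iteration used for the "unrolled once" case is an assumption drawn from the message's "nearer to 8GB/sec" remark.

```c
#include <assert.h>

/*
 * Peak store bandwidth of a zeroing loop in MB/sec (decimal megabytes).
 * Hypothetical helper for illustration; it is not part of the FreeBSD tree.
 */
static unsigned long
loop_mb_per_sec(unsigned long cpu_mhz, unsigned long bytes_per_iter,
    unsigned long cycles_per_iter)
{
    assert(cycles_per_iter != 0);
    /* MHz * bytes / cycles = millions of bytes per second. */
    return (cpu_mhz * bytes_per_iter / cycles_per_iter);
}
```

With a 2000 MHz clock, 4 bytes at 2 cycles/iteration gives 4000 MB/sec, the same loop at 3 cycles/iteration gives 2666 MB/sec (the 2.67GB/sec quoted above), and 8 bytes at 2 cycles/iteration gives 8000 MB/sec.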
Bruce Evans
be1ed810b1 Fix vm86 initialization, part 1 of 2 and a half.
Early use of vm86 depends on the PIC being reset to mask interrupts,
but r286667 moved PIC initialization to after where vm86 may be first
used.

Move the PIC initialization up to immediately before vm86 initialization.
All invocations of diff that I tried display this move poorly so that it
looks like PIC and vm86 initialization was moved later.

r286667 was to move console initialization later.  The diffs are again
unreadable -- they show a large move that doesn't seem to involve the
console.  The PIC initialization stayed just below the console
initialization where it could still be debugged but no longer works.

Later console initialization breaks mainly debugging vm86 initialization
and memory sizing using ddb and printf().  There are several printf()s
in the memory sizing that now go nowhere since message buffer
initialization has always been too late.  Memory sizing is done by loader
for most users, but the lost messages for this case are even more
interesting than for an auto-probe since they tell you what the loader
found.
2016-08-28 15:23:44 +00:00
Bruce Evans
441ead70cd Fix vm86 initialization, part 1 of 2 and a half.
vm86 uses the tss, but r273995 moved tss initialization to after where
it may be first used, just because tss_esp0 now depends on later
initializations and/or amd64 does it later.

vm86 is first used for memory sizing in cases where the loader can't
figure out the size or is not used.  Its initialization is placed
immediately before memory sizing to support this, and the tss was
initialized a little earlier.

Move everything in the tss initialization except for tss_esp0 back to
almost where it was, immediately before vm86 initialization (the
combined move is from before dblflt_tss initialization to after).  Add
only early initialization of tss_esp0, later reloading of the tss, and
comments.  The initial tss_esp0 no longer has space for the pcb since
initially the size of the pcb is not known and no pcb is needed.
(Later changes broke debugging at this point, so the nonexistent pcb
cannot be used by debuggers, and at the time of r273995 when ddb was
almost able to debug this problem it didn't need the pcb.)  The
initial tss_esp0 still has a magic 16 bytes reserved for vm86
although I think this is unused too.
2016-08-28 14:03:25 +00:00
Ed Schouten
48734c99d3 Convert pointers obtained from the threadattr_t structure with TO_PTR().
In all of these source files, the userspace pointer size corresponds
with the kernelspace pointer size, meaning that casting directly works.
As I'm planning on making 32-bit execution on 64-bit systems work as
well, use TO_PTR() here as well, so that the changes between source
files remain minimal.
2016-08-24 10:13:18 +00:00
John Baldwin
a47632d45b Fix build for !SMP kernels after the Xen MSIX workaround.
Move msix_disable_migration under #ifdef SMP since it doesn't make sense
for !SMP kernels.

PR:		212014
Reported by:	Glyn Grinstead <glyn@grinstead.org>
MFC after:	3 days
2016-08-22 21:23:17 +00:00
Ed Schouten
8b0a83cce2 Make CloudABI work on i386.
Copy over amd64's cloudabi64_sysvec.c into i386 and tailor it to work.
Again, we use a system call convention similar to FreeBSD, except that
there is no support for indirect system calls (%eax == 0).

Where i386 differs from amd64 is that we have to store thread/process
entry arguments on the stack instead of using registers. We also have to
put an extra pointer on the stack for TLS (for GSBASE). Place that
pointer in the empty slot that is normally used to hold return
addresses. That seems to keep the code simple.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D7590
2016-08-22 17:37:31 +00:00
John Baldwin
21768fa9c0 Remove the ie(4) driver for Intel 82586 ISA Ethernet adapters.
This driver only supports 10Mb Ethernet using PIO (the hardware supports
DMA, but the driver only does PIO).  There are not any PCCard adapters
supported by this driver, only ISA cards.  In addition, it does not use
bus_space but instead uses bcopy with volatile pointers triggering a
host of warnings.  (if_ie.c is one of 3 files always built with
-Wno-error)

Relnotes:	yes
2016-08-20 00:49:29 +00:00
John Baldwin
354b6f0fd9 Remove the spic(4) driver for the Sony Vaio Jogdial.
This hardware is not present on any modern systems.  The driver is quite
hackish (raw inb/outb instead of bus_space, and raw inb/outb to random
I/O ports to enable ACPI since it predated proper ACPI support).

Relnotes:	yes
2016-08-19 23:39:08 +00:00
John Baldwin
09b9789b28 Remove the wl(4) driver and wlconfig(8) utility.
The wl(4) driver supports pre-802.11 PCCard wireless adapters that
are slower than 802.11b.  They do not work with any of the 802.11
framework and the driver hasn't been reported to actually work in a
long time.

Relnotes:	yes
2016-08-19 22:27:14 +00:00
John Baldwin
c1c9764296 Remove the si(4) driver and sicontrol(8) for Specialix serial cards.
The si(4) driver supported multiport serial adapters for ISA, EISA, and
PCI buses.  This driver does not use bus_space, instead it depends on
direct use of the pointer returned by rman_get_virtual().  It is also
still locked by Giant and calls for patch testing to convert it to use
bus_space were unanswered.

Relnotes:	yes
2016-08-19 21:14:27 +00:00
Bruce Evans
5bd90da0ef Remove duplicate definition of get_pcb_td(). gcc works for detecting
this error.
2016-08-15 10:46:33 +00:00
Bruce Evans
258b53d151 Fix the variables $esp, $ds, $es, $fs, $gs and $ss in vm86 mode.
Fix PC_REGS() so that printing of instructions works in some useful
cases.  ddb only understands a single flat address space, but this
macro allows mapping $cs:$eip into vm86's flat address space well
enough for the MI parts of ddb.  This doesn't work for the MD parts
that do stack traces, and there are no similar macros for data addresses.

PC_REGS() has to use the trapframe pointer instead of the pcb for this.
For other CPUs, the trapframe pointer is not available except by tracing
back to it.  But tracing back through vm86 trapframes is broken even
starting with one.
2016-08-14 16:51:25 +00:00
Konstantin Belousov
e42f8233fc Unconditionally perform checks that FPU region was entered, when #NM
exception is caught in kernel mode.  There are third-party modules
which trigger the issue, and since the problem causes usermode state
corruption at least, panic in production kernels as well.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-10 13:44:03 +00:00
Konstantin Belousov
fa03524a9f Merge i386 and amd64 variants of mp_watchdog.c into x86/, there is no
difference between files.
For pc98, put x86/mp_x86.c into the same place as used by i386 file list.
Fix typo in comment.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-03 13:51:53 +00:00
Brooks Davis
40018b91dd Don't create pointless backups of generated files in "make sysent".
Any sensible workflow will include a revision control system from which
to restore the old files if required.  In normal usage, developers just
have to clean up the mess.

Reviewed by:	jhb
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D7353
2016-07-28 21:29:04 +00:00
Alexander Motin
fb112f72a8 Add more UEFI/e820 memory types from latest specifications.
This is only cosmetics.

MFC after:	2 weeks
2016-07-24 09:15:11 +00:00
John Baldwin
a5c6318fc3 Rename PTRACE_SYSCALL to LINUX_PTRACE_SYSCALL.
Suggested by:	kib
2016-07-16 00:54:46 +00:00
Eric Badger
fdb6320d45 Add explicit detection of KVM hypervisor
Set vm_guest to a new enum value (VM_GUEST_KVM) when kvm is detected and use
vm_guest in conditionals testing for KVM.

Also, fix a conditional checking if we're running in a VM which caught only
the generic VM case, but not more specific VMs (KVM, VMWare, etc.).  (Spotted
by: vangyzen).

Differential revision:	https://reviews.freebsd.org/D7172
Sponsored by:	Dell Inc.
Approved by:	kib (mentor), vangyzen (mentor)
Reviewed by:	alc
MFC after:	4 weeks
2016-07-13 19:19:18 +00:00
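The conditional fix described in the KVM detection message above can be pictured with a simplified model. The enum and function names below are hypothetical stand-ins for the kernel's vm_guest values; the point is only that comparing against the generic "VM" value misses the more specific hypervisor values, while comparing against "no guest" catches all of them.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of the kernel's vm_guest values. */
enum guest_model {
    GUEST_NO = 0,   /* bare metal */
    GUEST_VM,       /* generic, unidentified hypervisor */
    GUEST_XEN,
    GUEST_HV,
    GUEST_VMWARE,
    GUEST_KVM       /* the value this commit introduces */
};

/* The kind of test the commit fixes: true only for the generic case. */
static bool
in_vm_generic_only(enum guest_model g)
{
    assert(g <= GUEST_KVM);
    return (g == GUEST_VM);
}

/* The corrected form: true for any detected hypervisor. */
static bool
in_vm_any(enum guest_model g)
{
    assert(g <= GUEST_KVM);
    return (g != GUEST_NO);
}
```

Under this model a KVM guest fails the first test but passes the second, which is the behaviour the message describes.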
Jung-uk Kim
0aed566c32 Remove a tunable and always reset system clock while resuming with ACPI.
Requested by:	bde (long ago)
2016-07-13 19:16:32 +00:00
Roger Pau Monné
302244700f xen: automatically disable MSI-X interrupt migration
If the hypervisor version is smaller than 4.6.0. Xen commits 74fd00 and
70a3cb are required on the hypervisor side for this to be fixed, and those
are only included in 4.6.0, so stay on the safe side and disable MSI-X
interrupt migration on anything older than 4.6.0.

It should not cause major performance degradation unless a lot of MSI-X
interrupts are allocated.

Sponsored by:		Citrix Systems R&D
MFC after:		3 days
Reviewed by:		jhb
Differential revision:	https://reviews.freebsd.org/D7148
2016-07-12 08:43:09 +00:00
Konstantin Belousov
b42dfd6dff Fill tf_trapno for trap frames created for syscall.
If tf_trapno contains garbage which appears to be equal to T_NMI,
e.g. because the thread previously entered the kernel due to an NMI, the
doreti sequence skips ast, and does so until a trap or hardware interrupt
occurs.

The visible effects of the issue are quite confusing.  First, signal
delivery is postponed in observable ways.  In particular, the
guarantee that unblocked async signals queue is flushed before a
return from syscall, is broken.  Second, if there are pending signals,
all interruptible sleeps of the stuck thread are aborted immediately.

Since modern CPUs are relatively fast and a tickless kernel generates
a low interrupt rate, the faulty condition might exist for a long time (on
an application time scale).

In collaboration with:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-07-11 15:52:52 +00:00
Dmitry Chagin
97d06da692 Fix a copy/paste bug introduced during X86_64 Linuxulator work.
FreeBSD supports the NX bit on X86_64 processors out of the box; for i386 emulation,
use the READ_IMPLIES_EXEC flag introduced in r302515.

While here, move the common part of the mmap() and mprotect() code to files in compat/linux
to reduce code duplication between the Linuxulators.

Reported by:    Johannes Jost Meixner, Shawn Webb

MFC after:	1 week
XMFC with:	r302515, r302516
2016-07-10 08:22:04 +00:00
Dmitry Chagin
ab231b83ea Regen for r302215 (Linux personality).
2016-07-10 08:17:16 +00:00
Dmitry Chagin
23e8912c60 Implement Linux personality() system call mainly due to READ_IMPLIES_EXEC flag.
In Linux, if this flag is set, PROT_READ implies PROT_EXEC for mmap().
Linux/i386 sets this flag automatically if the binary requires an executable stack.

READ_IMPLIES_EXEC flag will be used in the next Linux mmap() commit.
2016-07-10 08:15:50 +00:00
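The rule stated in the personality() message above (with the flag set, PROT_READ implies PROT_EXEC) can be sketched as a small protection-adjusting function. This is an illustrative model, not the Linuxulator code: `effective_prot` and `MODEL_READ_IMPLIES_EXEC` are hypothetical names, and the constant's value is assumed to mirror the one in Linux's personality header.

```c
#include <assert.h>
#include <sys/mman.h>

/* Assumed to match Linux's READ_IMPLIES_EXEC; name is local to this sketch. */
#define MODEL_READ_IMPLIES_EXEC 0x0400000u

/*
 * Return the protection a mapping would effectively get: if the persona
 * carries the flag and the request includes PROT_READ, add PROT_EXEC.
 */
static int
effective_prot(int prot, unsigned int persona)
{
    /* Only the three basic protection bits are modeled here. */
    assert((prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC)) == 0);

    if ((persona & MODEL_READ_IMPLIES_EXEC) != 0 &&
        (prot & PROT_READ) != 0)
        prot |= PROT_EXEC;
    return (prot);
}
```

A write-only request is left untouched even when the flag is set, since the implication is keyed on PROT_READ alone.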
Nathan Whitehorn
96c85efb4b Replace a number of conflations of mp_ncpus and mp_maxid with either
mp_maxid or CPU_FOREACH() as appropriate. This fixes a number of places in
the kernel that assumed CPU IDs are dense in [0, mp_ncpus) and would try,
for example, to run tasks on CPUs that did not exist or to allocate too
few buffers on systems with sparse CPU IDs in which there are holes in the
range and mp_maxid > mp_ncpus. Such circumstances generally occur on
systems with SMT, but on which SMT is disabled. This patch restores system
operation at least on POWER8 systems configured in this way.

There are a number of other places in the kernel with potential problems
in these situations, but where sparse CPU IDs are not currently known
to occur, mostly in the ARM machine-dependent code. These will be fixed
in a follow-up commit after the stable/11 branch.

PR:		kern/210106
Reviewed by:	jhb
Approved by:	re (glebius)
2016-07-06 14:09:49 +00:00
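The mp_ncpus / mp_maxid confusion described in the message above is easy to reproduce in a userland model. In the sketch below every name is hypothetical: `MAXID` stands in for mp_maxid, `cpu_present` for the kernel's notion of which CPU IDs exist, and `FOREACH_PRESENT_CPU` is modeled on the shape of CPU_FOREACH(). The example assumes a system where only IDs 0 and 4 exist, as might happen with SMT disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAXID 7     /* hypothetical stand-in for mp_maxid */

/* Hypothetical sparse CPU map: only IDs 0 and 4 are present. */
static const bool cpu_present[MAXID + 1] = {
    [0] = true,
    [4] = true,
};

/* Visit every ID up to MAXID, skipping the ones that do not exist. */
#define FOREACH_PRESENT_CPU(i)                      \
    for ((i) = 0; (i) <= MAXID; (i)++)              \
        if (cpu_present[(i)])

/* Correct pattern: count present CPUs and report the highest ID seen. */
static int
count_present(int *highest)
{
    int i, n = 0;

    assert(highest != NULL);
    *highest = -1;
    FOREACH_PRESENT_CPU(i) {
        n++;
        *highest = i;
    }
    return (n);
}

/* Buggy pattern: assume IDs are dense in [0, ncpus). */
static int
count_dense_hits(int ncpus)
{
    int i, hits = 0;

    for (i = 0; i < ncpus; i++)
        if (cpu_present[i])
            hits++;
    return (hits);
}
```

With two CPUs present, the dense loop over [0, 2) touches ID 1, which does not exist, and never reaches ID 4. Per-CPU arrays sized by the CPU count would likewise be too small to index by ID 4, which is the "too few buffers" failure the message mentions.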
Konstantin Belousov
5c2cf81845 Update comments for the MD functions managing contexts for new
threads, to make them less confusing and to use modern kernel terms.

Rename the functions to reflect current use of the functions, instead
of the historic KSE conventions:
  cpu_set_fork_handler -> cpu_fork_kthread_handler (for kthreads)
  cpu_set_upcall -> cpu_copy_thread (for forks)
  cpu_set_upcall_kse -> cpu_set_upcall (for new thread creation)

Reviewed by:	jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (hrs)
Differential revision:	https://reviews.freebsd.org/D6731
2016-06-16 12:05:44 +00:00
Dmitry Chagin
5437e1d103 Add macro to convert errno and use it when appropriate.
MFC after:	1 week
2016-05-22 12:46:34 +00:00
Dmitry Chagin
f26a190f65 Regen after r300359 (struct l_sched_param removal).
MFC after:	1 week
2016-05-21 08:03:13 +00:00
Dmitry Chagin
8cc96fb43a Correct the param argument of the linux_sched_* system calls, as struct l_sched_param
is not defined due to its nature.

MFC after:	1 week
2016-05-21 08:01:14 +00:00
Konstantin Belousov
0bfad8e4a3 Check for overflow and return EINVAL if detected. Backport this and
r300305 to i386.

PR:	209661
Reported and reviewed by:	cturt
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2016-05-20 19:50:32 +00:00
Sepherosa Ziehau
dfdc9a05c6 atomic: Add testandclear on i386/amd64
Reviewed by:	kib
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D6381
2016-05-16 07:19:33 +00:00
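As a rough picture of what a test-and-clear primitive does, the sketch below builds one from C11 atomics: it clears a single bit and reports whether that bit was previously set, in one atomic step. This is not the i386/amd64 implementation the commit adds; the function name `testandclear_uint` is hypothetical, and the real primitive's argument conventions may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Atomically clear bit number 'bit' in *p and return true if it was set
 * beforehand.  Portable illustration only.
 */
static bool
testandclear_uint(atomic_uint *p, unsigned int bit)
{
    unsigned int mask;

    assert(bit < sizeof(unsigned int) * CHAR_BIT);
    mask = 1u << bit;
    /* atomic_fetch_and() returns the value held before the AND. */
    return ((atomic_fetch_and(p, ~mask) & mask) != 0);
}
```

Because the old value comes back from the same atomic operation that clears the bit, two threads racing to clear the same bit cannot both observe it as previously set.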
John Baldwin
8d791e5af1 Add a new bus method to fetch device-specific CPU sets.
bus_get_cpus() returns a specified set of CPUs for a device.  It accepts
an enum for the second parameter that indicates the type of cpuset to
request.  Currently two values are supported:

 - LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
   the device when DEVICE_NUMA is enabled)
 - INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)

For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL.  INTR_CPUS is mapped to 'all_cpus'
by default.  The idea is that INTR_CPUS should always return a valid set.

Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).

The x86 nexus driver exposes the internal set of interrupt CPUs from
the x86 interrupt code via INTR_CPUS.

The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled.  They also AND
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.

Compared to r298933, this version uses 'struct _cpuset' in
<sys/bus.h> instead of 'cpuset_t' to avoid requiring <sys/param.h>
(<sys/_cpuset.h> still requires <sys/param.h> for MAXCPU even though
<sys/_bitset.h> does not after recent changes).
2016-05-09 20:50:21 +00:00
John Baldwin
82cb5c3b5b Native PCI-express HotPlug support.
PCI-express HotPlug support is implemented via bits in the slot
registers of the PCI-express capability of the downstream port along
with an interrupt that triggers when bits in the slot status register
change.

This is implemented for FreeBSD by adding HotPlug support to the
PCI-PCI bridge driver which attaches to the virtual PCI-PCI bridges
representing downstream ports on HotPlug slots. The PCI-PCI bridge
driver registers an interrupt handler to receive HotPlug events. It
also uses the slot registers to determine the current HotPlug state
and drive an internal HotPlug state machine. For simplicity of
implementation, the PCI-PCI bridge device detaches and deletes the
child PCI device when a card is removed from a slot and creates and
attaches a PCI child device when a card is inserted into the slot.

The PCI-PCI bridge driver provides a bus_child_present which claims
that child devices are present on HotPlug-capable slots only when a
card is inserted. Rather than requiring a timeout in the RC for
config accesses to not-present children, the pcib_read/write_config
methods fail all requests when a card is not present (or not yet
ready).

These changes include support for various optional HotPlug
capabilities such as a power controller, mechanical latch,
electro-mechanical interlock, indicators, and an attention button.
It also includes support for devices which require waiting for
command completion events before initiating a subsequent HotPlug
command. However, it has only been tested on ExpressCard systems
which support surprise removal and have none of these optional
capabilities.

PCI-express HotPlug support is conditional on the PCI_HP option
which is enabled by default on arm64, x86, and powerpc.

Reviewed by:	adrian, imp, vangyzen (older versions)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D6136
2016-05-05 22:26:23 +00:00
Roger Pau Monné
71718d3c73 xen/i386: enable the platform hypercall for i386
Not sure why the platform hypercall was disabled on i386, just enable it in
order to fix compilation of the PV timer on i386.

Sponsored by: Citrix Systems R&D
2016-05-03 08:05:14 +00:00
Konstantin Belousov
e1da986b54 Make it explicit that D_MEM cdevsw d_flag is to signify that the
driver is (or behaves identically to) /dev/mem.  Remove the D_MEM flag
from random drivers.

Note that currently the D_MEM flag does not affect any behaviour, but
this is going to change in the next commit.

Noted and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
X-Differential revision:	https://reviews.freebsd.org/D6149
2016-05-01 17:46:56 +00:00
John Baldwin
e131ba36e8 Move 'device pci' for the PCI bus driver to the MI NOTES file.
The PCI bus was already listed in all of the MD NOTES files and the
driver should at least compile on all platforms.
2016-04-29 23:53:55 +00:00
Pedro F. Giffuni
b66bb393f2 Clean up redundant parentheses from existing howmany()/roundup() macro uses.
2016-04-22 16:57:42 +00:00
Pedro F. Giffuni
a380994fff Yet more redundant parentheses from r298431.
Mea culpa.
2016-04-21 20:30:38 +00:00
Pedro F. Giffuni
d9c9c81c08 sys: use our roundup2/rounddown2() macros when param.h is available.
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
2016-04-21 19:57:40 +00:00
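For readers unfamiliar with the macros named in the message above, the sketch below shows what roundup2() and rounddown2() compute. The definitions are written in the same shape as the power-of-two rounding macros in FreeBSD's <sys/param.h>, guarded so they do not clash if that header is present; the wrapper functions `page_trunc` and `page_round` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Both macros require y to be a power of two. */
#ifndef rounddown2
#define rounddown2(x, y)    ((x) & ~((y) - 1))
#endif
#ifndef roundup2
#define roundup2(x, y)      (((x) + ((y) - 1)) & ~((y) - 1))
#endif

/* Round an address down to the start of its page. */
static uint32_t
page_trunc(uint32_t addr, uint32_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    return (rounddown2(addr, pagesize));
}

/* Round an address up to the next page boundary (no-op if aligned). */
static uint32_t
page_round(uint32_t addr, uint32_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    return (roundup2(addr, pagesize));
}
```

The open-coded equivalent of `rounddown2(addr, 4096)` is `addr & ~(4096 - 1)`, which is sometimes shorter than the macro call; that is the readability trade-off the message refers to.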
Pedro F. Giffuni
ea24b0561f X86: use our nitems() macro when it is available through param.h.
No functional change, only trivial cases are done in this sweep.

Discussed in:	freebsd-current
2016-04-19 23:41:46 +00:00
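The "trivial cases" in the nitems() message above are loops bounded by an open-coded sizeof division. The sketch below shows the pattern, assuming the usual definition of the macro as the array size divided by the element size; `irq_table` and `sum_table` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#ifndef nitems
#define nitems(x)   (sizeof((x)) / sizeof((x)[0]))
#endif

/* Hypothetical data used only for this example. */
static const int irq_table[] = { 3, 4, 5, 7, 9 };

/* Sum the table, using nitems() as the loop bound. */
static int
sum_table(void)
{
    size_t i;
    int sum = 0;

    assert(nitems(irq_table) > 0);
    for (i = 0; i < nitems(irq_table); i++)
        sum += irq_table[i];
    return (sum);
}
```

The macro only works on arrays whose size is visible at the point of use; applied to a pointer it would silently divide the pointer size by the element size.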
Sepherosa Ziehau
0c29fe6db8 hyperv: Deprecate HYPERV option by moving Hyper-V IDT vector into vmbus
Submitted by:	Jun Su <junsu microsoft com>
Reviewed by:	jhb, kib, sephe
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5910
2016-04-15 02:20:18 +00:00
Pedro F. Giffuni
a3269b0863 x86: for pointers replace 0 with NULL.
These are mostly cosmetic, no functional change.

Found with devel/coccinelle.
2016-04-14 17:04:06 +00:00
John Baldwin
4478441145 Expose doreti as a global symbol on amd64 and i386.
doreti provides the common code path for returning from interrupt
handlers on x86.  Exposing doreti as a global symbol allows kernel
modules to include low-level interrupt handlers instead of requiring
all low-level handlers to be statically compiled into the kernel.

Submitted by:	Howard Su <howard0su@gmail.com>
Reviewed by:	kib
2016-04-13 17:37:31 +00:00
Andriy Gapon
0d63fc3ed8 re-enable AMD Topology extension on certain models if disabled by BIOS
Some BIOSes disable AMD Topology extension on AMD Family 15h notebook
processors.  We re-enable the extension, so that we can properly discover
core and cache topology.  Linux seems to do the same.

Reported by:	Johannes Dieterich <dieterich.joh@gmail.com>
Reviewed by:	jhb, kib
Tested by:	Johannes Dieterich <dieterich.joh@gmail.com>
		(earlier version)
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D5883
2016-04-12 13:30:39 +00:00
Baptiste Daroussin
b6348be7b9 Add kern.features flags for linux and linux64 modules
kern.features.linux: 1 meaning Linux 32-bit binaries are supported
kern.features.linux64: 1 meaning Linux 64-bit binaries are supported

The goal here is to help 3rd party applications (including ports) determine
whether the host supports Linux emulation.

Reviewed by:	dchagin
MFC after:	1 week
Relnotes:	yes
Differential Revision:	D5830
2016-04-05 22:36:48 +00:00
John Baldwin
2b1e924b69 Move i386/i386/autoconf.c to sys/x86/x86 and use it on both amd64 and i386.
2016-04-03 23:03:54 +00:00
Konstantin Belousov
0df87548b9 Type of the interrupt handlers on x86 cannot be expressed in C.
Simplify and unify placeholder type definitions.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D5771
2016-03-29 19:56:48 +00:00
Dmitry Chagin
7c5982000d Revert r297310 as the SOL_XXX are equal to the IPPROTO_XX except SOL_SOCKET.
Pointed out by:	ae@
2016-03-27 10:09:10 +00:00