Commit Graph

5922 Commits

Author SHA1 Message Date
Attilio Rao
68b739cd6f Add the possibility to specify from kernel configs MAXCPU value.
This patch is going to help in cases like mips flavours where you
want a more granular support on MAXCPU.

No MFC is previewed for this patch.

Tested by:	pluknet
Approved by:	re (kib)
2011-07-19 00:37:24 +00:00
Attilio Rao
521ea19d1c - Remove the eintrcnt/eintrnames usage and introduce the concept of
sintrcnt/sintrnames which are symbols containing the size of the 2
  tables.
- For amd64/i386 remove the storage of intr* stuff from assembly files.
  This area can be widely improved by applying the same to other
  architectures and likely finding an unified approach among them and
  move the whole code to be MI. More work in this area is expected to
  happen fairly soon.

No MFC is previewed for this patch.

Tested by:	pluknet
Reviewed by:	jhb
Approved by:	re (kib)
2011-07-18 15:19:40 +00:00
Jung-uk Kim
f0b28f005e Correct cpu_monitor() and cpu_mwait() for amd64. These instructions take
%rcx as "extensions" in long mode.  If any unused bit is set in %rcx, these
instructions cause general protection fault.  Fix style nits and synchronize
i386 with amd64.
2011-07-05 18:42:10 +00:00
Attilio Rao
470107b2f1 MFC 2011-07-04 11:13:00 +00:00
Alan Cox
80788b2a27 When iterating over a paging queue, explicitly check for PG_MARKER, instead
of relying on zeroed memory being interpreted as an empty PV list.

Reviewed by:	kib
2011-07-02 23:42:04 +00:00
Jonathan Anderson
12bc222e57 Add some checks to ensure that Capsicum is behaving correctly, and add some
more explicit comments about what's going on and what future maintainers
need to do when e.g. adding a new operation to a sys_machdep.c.

Approved by: mentor(rwatson), re(bz)
2011-06-30 10:56:02 +00:00
Attilio Rao
7b744f6b01 MFC 2011-06-30 10:19:43 +00:00
Alan Cox
6bbee8e28a Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this
option to vm_object_page_remove() asserts that the specified range of pages
is not mapped, or more precisely that none of these pages have any managed
mappings.  Thus, vm_object_page_remove() need not call pmap_remove_all() on
the pages.

This change not only saves time by eliminating pointless calls to
pmap_remove_all(), but it also eliminates an inconsistency in the use of
pmap_remove_all() versus related functions, like pmap_remove_write().  It
eliminates harmless but pointless calls to pmap_remove_all() that were being
performed on PG_UNMANAGED pages.

Update all of the existing assertions on pmap_remove_all() to reflect this
change.

Reviewed by:	kib
2011-06-29 16:40:41 +00:00
Jonathan Anderson
24c1c3bf71 We may split today's CAPABILITIES into CAPABILITY_MODE (which has
to do with global namespaces) and CAPABILITIES (which has to do with
constraining file descriptors). Just in case, and because it's a better
name anyway, let's move CAPABILITIES out of the way.

Also, change opt_capabilities.h to opt_capsicum.h; for now, this will
only hold CAPABILITY_MODE, but it will probably also hold the new
CAPABILITIES (implying constrained file descriptors) in the future.

Approved by: rwatson
Sponsored by: Google UK Ltd
2011-06-29 13:03:05 +00:00
Attilio Rao
6b6603b30e Remove the pc_cpumask usage from amd64.
Reviewed by:	alc
Tested by:	pluknet
2011-06-26 21:36:53 +00:00
Attilio Rao
de138ec703 MFC 2011-06-24 16:35:40 +00:00
John Baldwin
1368987ae4 Move {amd64,i386}/pci/pci_bus.c and {amd64,i386}/include/pci_cfgreg.h to
the x86 tree.  The $PIR code is still only enabled on i386 and not amd64.
While here, make the qpi(4) driver on conditional on 'device pci'.
2011-06-22 21:04:13 +00:00
Attilio Rao
9b571ec6b3 MFC 2011-06-22 19:42:32 +00:00
John Baldwin
e8f40e32eb Oops, missed these in 223424.
Reported by:	jkim
2011-06-22 18:48:07 +00:00
John Baldwin
3bf59bd14f Use uintXX_t instead of u_intXX_t. 2011-06-22 17:55:16 +00:00
John Baldwin
38d7a61ba4 Add a helper routine to conditionally modify the start address of a
resource allocation from an x86 Host-PCI bridge driver so that it can be
reused by the ACPI Host-PCI bridge driver (and eventually the MPTable
Host-PCI bridge driver) instead of duplicating the same logic.  Note that
this means that hw.acpi.host_mem_start is now replaced with the
hw.pci.host_mem_start tunable that was already used in the non-ACPI case.
This also removes hw.acpi.host_mem_start on ia64 where it was not
applicable (the implementation was very x86-specific).

While here, adjust the logic to apply the new start address on any
"wildcard" allocation even if that allocation comes from a subset of
the allowable address range.

Reviewed by:	imp (1)
2011-06-22 16:15:15 +00:00
Attilio Rao
2e4cefa632 Remove the usage of pc_other_cpus from amd64.
Tested by:	pluknet
2011-06-21 09:19:38 +00:00
Konstantin Belousov
1c23d0f727 Fix vfork. Add comments. 2011-06-18 12:13:28 +00:00
Hans Petter Selasky
144b716627 Enable USB 3.0 support by default in i386 and amd64 GENERIC kernels.
Discussed with:	joel @ and thompsa @
MFC after:	7 days
2011-06-14 20:30:49 +00:00
Joel Dahl
701b698b6f Enable sound support by default on i386 and amd64.
The generic sound driver has been added, along with enough
device-specific drivers to support the most common audio
chipsets.

We've discussed enabling it from time to time over the years
and we've received numerous requests from users, so we decided
that shipping 9.0 with working audio by default would be the
best thing to do.

Bug reports should be sent to the multimedia@ mailing list, as
usual.

Approved by:    mav
No objection:   re
2011-06-11 09:08:46 +00:00
John Baldwin
049dc0d1ff Implement BUS_ADJUST_RESOURCE() for the x86 drivers that sit between the
Host-PCI bridge drivers and nexus.
2011-06-10 12:30:16 +00:00
Andriy Gapon
234dab4a82 remove code for dynamic offlining/onlining of CPUs on x86
The code has definitely been broken for SCHED_ULE, which is a default
scheduler.  It may have been broken for SCHED_4BSD in more subtle ways,
e.g. with manually configured CPU affinities and for interrupt devilery
purposes.
We still provide a way to disable individual CPUs or all hyperthreading
"twin" CPUs before SMP startup.  See the UPDATING entry for details.

Interaction between building CPU topology and disabling CPUs still
remains fuzzy: topology is first built using all availble CPUs and then
the disabled CPUs should be "subtracted" from it.  That doesn't work
well if the resulting topology becomes non-uniform.

This work is done in cooperation with Attilio Rao who in addition to
reviewing also provided parts of code.

PR:		kern/145385
Discussed with:	gcooper, ambrisko, mdf, sbruno
Reviewed by:	attilio
Tested by:	pho, pluknet
X-MFC after:	never
2011-06-08 08:12:15 +00:00
Attilio Rao
74e4245e3f Bring back the number of CPU to 32. 2011-06-07 08:05:23 +00:00
Attilio Rao
81c02539f1 MFC 2011-06-06 21:38:39 +00:00
Andriy Gapon
ecee337a8c don't use cpuid level 4 in x86 cpu topology detection if it's not supported
This regression was introduced in r213323.
There are probably no Intel cpus that support amd64 mode, but do not
support cpuid level 4, but it's better to keep i386 and amd64 versions
of this code in sync.

Discovered by:	pho
Tested by:	pho
MFC after:	2 weeks
2011-06-06 14:23:13 +00:00
Kevin Lo
a92e80be3f Bring back r222275. runfw(4) will statically link in rt2870.fw.uu
to the kernel, though I have MODULES_OVERRIDE="" in GENERIC.

Spotted by:	thompsa
2011-05-25 10:04:13 +00:00
Kevin Lo
6d5ee6cd7f run(4) needs firmware loaded to work 2011-05-25 04:46:48 +00:00
Attilio Rao
d955f0fccf Revert a patch that involountary sneaked in while I was MFCing. 2011-05-23 23:51:01 +00:00
Attilio Rao
a9ff18a210 MFC 2011-05-23 01:17:30 +00:00
Attilio Rao
5f6b159db7 MFC 2011-05-18 16:01:29 +00:00
Jung-uk Kim
2b052e43be Update CPUID bits to reflect AMD Bulldozer and Intel Sandy Bridge features.
Note AMD dropped SSE5 extensions in order to avoid ISA overlap with Intel
AVX instructions.  The SSE5 bit was recycled as XOP extended instruction
bit, CVT16 was deprecated in favor of F16C (half-precision float conversion
instructions for AVX), and the remaining FMA4 (4-operand FMA instructions)
gained a separate CPUID bit.  Replace non-existent references with today's
CPUID specifications.
2011-05-17 22:36:16 +00:00
Attilio Rao
b2aa562e7b MFC 2011-05-13 20:58:48 +00:00
Matthew D Fleming
cfb00e5aa7 Move the ZERO_REGION_SIZE to a machine-dependent file, as on many
architectures (i386, for example) the virtual memory space may be
constrained enough that 2MB is a large chunk.  Use 64K for arches
other than amd64 and ia64, with special handling for sparc64 due to
differing hardware.

Also commit the comment changes to kmem_init_zero_region() that I
missed due to not saving the file.  (Darn the unfamiliar development
environment).

Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you
see fit.

Requested by:	alc
MFC after:	1 week
MFC with:	r221853
2011-05-13 19:35:01 +00:00
Attilio Rao
ef607a6aa3 MFC 2011-05-12 14:01:40 +00:00
Dmitry Chagin
98cde5eede Remove wrong comment.
MFC after:	1 week.
2011-05-11 17:57:15 +00:00
Jung-uk Kim
00c885e181 Add SC_PIXEL_MODE to GENERIC for amd64 and i386.
Requested by:	many
2011-05-10 16:44:16 +00:00
Attilio Rao
bd55ede060 MFC 2011-05-09 18:53:13 +00:00
Jung-uk Kim
65e7d70b09 Implement boot-time TSC synchronization test for SMP. This test is executed
when the user has indicated that the system has synchronized TSCs or it has
P-state invariant TSCs.  For the former case, we may clear the tunable if it
fails the test to prevent accidental foot-shooting.  For the latter case, we
may set it if it passes the test to notify the user that it may be usable.
2011-05-09 17:34:00 +00:00
Attilio Rao
aa8b9e0706 MFC 2011-05-06 22:45:33 +00:00
Andriy Gapon
fdf30d59a6 prepare code that does topology detection for amd cpus for bulldozer
This also introduces a new detection path for family 10h and newer
pre-bulldozer cpus, pre-10h hardware should not be affected.

Tested by:	Gary Jennejohn <gljennjohn@googlemail.com>
		(with pre-10h hardware)
MFC after:	2 weeks
2011-05-06 13:51:54 +00:00
Attilio Rao
71a19bdc64 Commit the support for removing cpumask_t and replacing it directly with
cpuset_t objects.
That is going to offer the underlying support for a simple bump of
MAXCPU and then support for number of cpus > 32 (as it is today).

Right now, cpumask_t is an int, 32 bits on all our supported architecture.
cpumask_t on the other side is implemented as an array of longs, and
easilly extendible by definition.

The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN

while the others are still missing.
Userland is believed to be fully converted with the changes contained
here.

Some technical notes:
- This commit may be considered an ABI nop for all the architectures
  different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, needs to be
  accessed avoiding migration, because the size of cpuset_t should be
  considered unknown
- size of cpuset_t objects is different from kernel and userland (this is
  primirally done in order to leave some more space in userland to cope
  with KBI extensions). If you need to access kernel cpuset_t from the
  userland please refer to example in this patch on how to do that
  correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now

The patch has been tested by sbruno and Nicholas Esborn on opteron
4 x 12 pack CPUs. More testing on big SMP is expected to came soon.
pluknet tested the patch with his 8-ways on both amd64 and i386.

Tested by:	pluknet, sbruno, gianni, Nicholas Esborn
Reviewed by:	jeff, jhb, sbruno
2011-05-05 14:39:14 +00:00
Attilio Rao
8c0ef2464e Revert md_assert_preempt() introduction.
Discussed with:	jeff, jhb
2011-05-04 20:29:40 +00:00
Attilio Rao
94ebcddde3 MFC 2011-05-03 18:57:46 +00:00
John Baldwin
6162795be0 Enable the new PCI-PCI bridge driver on amd64 and i386 by default. It can
be disabled via 'nooptions NEW_PCIB'.
2011-05-03 18:23:11 +00:00
John Baldwin
83c41143ca Reimplement how PCI-PCI bridges manage their I/O windows. Previously the
driver would verify that requests for child devices were confined to any
existing I/O windows, but the driver relied on the firmware to initialize
the windows and would never grow the windows for new requests.  Now the
driver actively manages the I/O windows.

This is implemented by allocating a bus resource for each I/O window from
the parent PCI bus and suballocating that resource to child devices.  The
suballocations are managed by creating an rman for each I/O window.  The
suballocated resources are mapped by passing the bus_activate_resource()
call up to the parent PCI bus.  Windows are grown when needed by using
bus_adjust_resource() to adjust the resource allocated from the parent PCI
bus.  If the adjust request succeeds, the window is adjusted and the
suballocation request for the child device is retried.

When growing a window, the rman_first_free_region() and
rman_last_free_region() routines are used to determine if the front or
end of the existing I/O window is free.  From using that, the smallest
ranges that need to be added to either the front or back of the window
are computed.  The driver will first try to grow the window in whichever
direction requires the smallest growth first followed by the other
direction if that fails.

Subtractive bridges will first attempt to satisfy requests for child
resources from I/O windows (including attempts to grow the windows).  If
that fails, the request is passed up to the parent PCI bus directly
however.

The PCI-PCI bridge driver will try to use firmware-assigned ranges for
child BARs first and only allocate a "fresh" range if that specific range
cannot be accommodated in the I/O window.  This allows systems where the
firmware assigns resources during boot but later wipes the I/O windows
(some ACPI BIOSen are known to do this) to "rediscover" the original I/O
window ranges.

The ACPI Host-PCI bridge driver has been adjusted to correctly honor
hw.acpi.host_mem_start and the I/O port equivalent when a PCI-PCI bridge
makes a wildcard request for an I/O window range.

The new PCI-PCI bridge driver is only enabled if the NEW_PCIB kernel option
is enabled.  This is a transition aide to allow platforms that do not
yet support bus_activate_resource() and bus_adjust_resource() in their
Host-PCI bridge drivers (and possibly other drivers as needed) to use the
old driver for now.  Once all platforms support the new driver, the
kernel option and old driver will be removed.

PR:		kern/143874 kern/149306
Tested by:	mav
2011-05-03 17:37:24 +00:00
Attilio Rao
7be8a2de4f MFC @ r221324 2011-05-02 14:23:36 +00:00
John Baldwin
d2c9344ff9 Add implementations of BUS_ADJUST_RESOURCE() to the PCI bus driver,
generic PCI-PCI bridge driver, x86 nexus driver, and x86 Host to PCI bridge
drivers.
2011-05-02 14:13:12 +00:00
Bernhard Schmidt
d1f25d5dcb Add the remaining wireless drivers.
Discussed with:	joel
2011-05-01 13:26:34 +00:00
Attilio Rao
f1edea81ac Add the function md_assert_nopreempt(), which is a very consistent
function on the possibility of a thread to not preempt.

As this function is very tied to x86 (interrupts disabled checkings)
it is not intended to be used in MI code.
2011-04-30 23:12:37 +00:00
Kevin Lo
5aaea65247 Add urtw(4) 2011-04-29 06:36:39 +00:00
Jung-uk Kim
c34e9dbee1 Define "Hypervisor Present" bit. This bit is used by several hypervisors to
identify CPUs running under emulation.  Currently QEMU-KVM, Xen-HVM, VMware,
and MS Hyper-V are known to set this bit.

MFC after:	3 days
2011-04-28 22:23:39 +00:00
Attilio Rao
2be767e069 Add the watchdogs patting during the (shutdown time) disk syncing and
disk dumping.
With the option SW_WATCHDOG on, these operations are doomed to let
watchdog fire, fi they take too long.

I implemented the stubs this way because I really want wdog_kern_*
KPI to not be dependant by SW_WATCHDOG being on (and really, the option
only enables watchdog activation in hardclock) and also avoid to
call them when not necessary (avoiding not-volountary watchdog
activations).

Sponsored by:	Sandvine Incorporated
Discussed with:	emaste, des
MFC after:	2 weeks
2011-04-28 16:02:05 +00:00
Rick Macklem
4309e17add This patch changes head so that the default NFS client is now the new
NFS client (which I guess is no longer experimental). The fstype "newnfs"
is now "nfs" and the regular/old NFS client is now fstype "oldnfs".
Although mounts via fstype "nfs" will usually work without userland
changes, an updated mount_nfs(8) binary is needed for kernels built with
"options NFSCL" but not "options NFSCLIENT". Updated mount_nfs(8) and
mount(8) binaries are needed to do mounts for fstype "oldnfs".
The GENERIC kernel configs have been changed to use options
NFSCL and NFSD (the new client and server) instead of NFSCLIENT and NFSSERVER.
For kernels being used on diskless NFS root systems, "options NFSCL"
must be in the kernel config.
Discussed on freebsd-fs@.
2011-04-27 17:51:51 +00:00
Alexander Motin
0d307e0905 - Add shim to simplify migration to the CAM-based ATA. For each new adaX
device in /dev/ create symbolic link with adY name, trying to mimic old ATA
numbering. Imitation is not complete, but should be enough in most cases to
mount file systems without touching /etc/fstab.
 - To know what behavior to mimic, restore ATA_STATIC_ID option in cases
where it was present before.
 - Add some more details to UPDATING.
2011-04-26 17:01:49 +00:00
Maxim Sobolev
f30bc1f3b5 With the typical memory size of the system in tenth of gigabytes
counting memory being dumped in 16MB increments is somewhat silly.
Especially if the dump fails and everything you've got for debugging
is screen filled with numbers in 16 decrements... Replace that with
percentage-based progress with max 10 updates all fitting into one
line.

Collapse other very "useful" piece of crash information (total ram) into
the same line to save some more space.

MFC after:	1 week
2011-04-26 16:14:55 +00:00
Rick Macklem
7c208ed659 Fix the experimental NFS client so that it does not bogusly
set the f_flags field of "struct statfs". This had the interesting
effect of making the NFSv4 mounts "disappear" after r221014,
since NFSMNT_NFSV4 and MNT_IGNORE became the same bit.
Move the files used for a diskless NFS root from sys/nfsclient
to sys/nfs in preparation for them to be used by both NFS
clients. Also, move the declaration of the three global data
structures from sys/nfsclient/nfs_vfsops.c to sys/nfs/nfs_diskless.c
so that they are defined when either client uses them.

Reviewed by:	jhb
MFC after:	2 weeks
2011-04-25 22:22:51 +00:00
Alexander Motin
97b53e3634 Switch the GENERIC kernels for all architectures to the new CAM-based ATA
stack. It means that all legacy ATA drivers are disabled and replaced by
respective CAM drivers. If you are using ATA device names in /etc/fstab or
other places, make sure to update them respectively (adX -> adaY,
acdX -> cdY, afdX -> daY, astX -> saY, where 'Y's are the sequential
numbers for each type in order of detection, unless configured otherwise
with tunables, see cam(4)).

ataraid(4) functionality is now supported by the RAID GEOM class.
To use it you can load geom_raid kernel module and use graid(8) tool
for management. Instead of /dev/arX device names, use /dev/raid/rX.
2011-04-24 08:58:58 +00:00
Konstantin Belousov
3136faa59d Make pmap_invalidate_cache_range() available for consumption on amd64.
Add pmap_invalidate_cache_pages() method on x86. It flushes the CPU
cache for the set of pages, which are not neccessary mapped. Since its
supposed use is to prepare the move of the pages ownership to a device
that does not snoop all CPU accesses to the main memory (read GPU in
GMCH), do not rely on CPU self-snoop feature.

amd64 implementation takes advantage of the direct map. On i386,
extract the helper pmap_flush_page() from pmap_page_set_memattr(), and
use it to make a temporary mapping of the flushed page.

Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2011-04-18 21:24:42 +00:00
Jung-uk Kim
0e72764232 Add a function rdtsc32() to read lower 32 bits from TSC and discard upper
32 bits.  Some times compiler inserts unnecessary instructions to preserve
unused upper 32 bits even when it is casted to a 32-bit value.  It reduces
such compiler mistakes where every cycle counts.
2011-04-14 16:53:32 +00:00
Jung-uk Kim
4854ae249c Consistently use __volatile as the rest of this file. 2011-04-14 16:19:41 +00:00
Jung-uk Kim
f5ac47f44c Prefer C99 standard integers to reduce diff from i386 version. 2011-04-14 16:14:35 +00:00
Jung-uk Kim
a7817c7ae5 Reduce errors in effective frequency calculation. 2011-04-12 23:49:07 +00:00
Jung-uk Kim
b9e4376214 Reinstate cpu_est_clockrate() support for P-state invariant TSC if APERF and
MPERF MSRs are available.  It was disabled in r216443.  Remove the earlier
hack to subtract 0.5% from the calibrated frequency as DELAY(9) is little
bit more reliable now.
2011-04-12 23:04:01 +00:00
Jung-uk Kim
dd3e254ebd Add forgotten declarations for tsc_perf_stat from the previous commit. 2011-04-12 22:22:01 +00:00
Jung-uk Kim
155094d77a Probe capability to find effective frequency. When the TSC is P-state
invariant, APERF/MPERF ratio can be used to find effective frequency.
2011-04-12 22:15:46 +00:00
Jung-uk Kim
3731174954 Add definitions for CPUID instruction 6, ECX information. 2011-04-12 22:12:23 +00:00
Konstantin Belousov
2140d9c83b Remove setting of PCB_FULL_IRET at the places where we are going to call
update_gdt_{f,g}sbase. The functions set the flag when td == curthread,
and sysarch is always called with curthread.

Reviewed by:	jhb, jkim
MFC after:	1 week
2011-04-08 21:27:31 +00:00
Konstantin Belousov
5ab73cbcba Disable local interrupts before testing the PCB_FULL_IRET flag.
Thread might be preempted after testing, which causes the flag to be
cleared. If ast was not delivered, we will do sysret with potentially
wrong fs/gs bases.

Reviewed by:	jhb, jkim
MFC after:	1 week (together with r220430, r220452)
2011-04-08 21:26:50 +00:00
Ryan Stone
7d6a0bf373 Add tunables that mirror the functionality of sysctls machdep.panic_on_nmi
and machdep.kdb_on_nmi.

Approved by:	emaste (mentor)
MFC after:	1 week
2011-04-08 14:39:41 +00:00
John Baldwin
13fb631aff Fix a bug in the previous change to restore the fast path for syscall
return.  The ast() function may cause a context switch in which case
PCB_FULL_IRET would be set in the pcb.  However, the code was not
rechecking the flag after ast() returned and would not properly restore
the FSBASE and GSBASE MSRs.  To fix, recheck the PCB_FULL_IRET flag after
ast() returns.

While here, trim an instruction (and memory access) from the doreti path
and fix a typo in a comment.

MFC after:	1 week
2011-04-08 13:33:57 +00:00
John Baldwin
ff265077cf Catch up to PCB_FULL_IRET becoming a pcb flag rather than a full field.
MFC after:	3 days
2011-04-08 13:30:48 +00:00
Jung-uk Kim
3453537fa5 Use atomic load & store for TSC frequency. It may be overkill for amd64 but
safer for i386 because it can be easily over 4 GHz now.  More worse, it can
be easily changed by user with 'machdep.tsc_freq' tunable (directly) or
cpufreq(4) (indirectly).  Note it is intentionally not used in performance
critical paths to avoid performance regression (but we should, in theory).
Alternatively, we may add "virtual TSC" with lower frequency if maximum
frequency overflows 32 bits (and ignore possible incoherency as we do now).
2011-04-07 23:28:28 +00:00
John Baldwin
615d2dffa3 pcb_flags is an int, so use testl rather than testq.
Pointy hat to:	jhb
Submitted by:	jkim
MFC after:	1 week
2011-04-07 23:13:22 +00:00
John Baldwin
1438b4ced1 If a system call does not request a full interrupt return, use a fast
path via the sysretq instruction to return from the system call.  This was
removed in 190620 and not quite fully restored in 195486.  This resolves
most of the performance regression in system call microbenchmarks between
7 and 8 on amd64.

Reviewed by:	kib
MFC after:	1 week
2011-04-07 21:32:25 +00:00
Jung-uk Kim
efd393d539 Remove stale checks for RDTSC support. amd64 must have TSC support anyway. 2011-04-07 21:29:34 +00:00
Konstantin Belousov
7332c129e0 Add support for executing the FreeBSD 1/i386 a.out binaries on amd64.
In particular:
- implement compat shims for old stat(2) variants and ogetdirentries(2);
- implement delivery of signals with ancient stack frame layout and
  corresponding sigreturn(2);
- implement old getpagesize(2);
- provide a user-mode trampoline and LDT call gate for lcall $7,$0;
- port a.out image activator and connect it to the build as a module
  on amd64.

The changes are hidden under COMPAT_43.

MFC after:   1 month
2011-04-01 11:16:29 +00:00
Andriy Gapon
a930718af1 Revert r220032:linux compat: add SO_PASSCRED option with basic handling
I have not properly thought through the commit.  After r220031 (linux
compat: improve and fix sendmsg/recvmsg compatibility) the basic
handling for SO_PASSCRED is not sufficient as it breaks recvmsg
functionality for SCM_CREDS messages because now we would need to handle
sockcred data in addition to cmsgcred.  And that is not implemented yet.

Pointyhat to:	avg
2011-03-31 08:14:51 +00:00
Adrian Chadd
dba9c85977 Break out the ath PCI logic into a separate device/module.
Introduce the AHB glue for Atheros embedded systems. Right now it's
hard-coded for the AR9130 chip whose support isn't yet in this HAL;
it'll be added in a subsequent commit.

Kernel configuration files now need both 'ath' and 'ath_pci' devices; both
modules need to be loaded for the ath device to work.
2011-03-31 08:07:13 +00:00
Edward Tomasz Napierala
72bcfc9693 Revert part of r220137, committed by mistake - RACCT is _not_ supposed
to be enabled in GENERIC.
2011-03-29 18:16:49 +00:00
Edward Tomasz Napierala
097055e26d Add racct. It's an API to keep per-process, per-jail, per-loginclass
and per-loginclass resource accounting information, to be used by the new
resource limits code.  It's connected to the build, but the code that
actually calls the new functions will come later.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-03-29 17:47:25 +00:00
Alan Cox
1c675a3bc3 The new binutils has correctly redefined MAXPAGESIZE on amd64 as 0x200000
instead of 0x100000.  As a side effect, an amd64 kernel now loads at
physical address 0x200000 instead of 0x100000.  This is probably for the
best because it avoids the use of a 2MB page mapping for the first 1MB of
the kernel that also spans the fixed MTRRs.  However, getmemsize() still
thinks that the kernel loads at 0x100000, and so the physical memory between
0x100000 and 0x200000 is lost.  Fix this problem by replacing the hard-wired
constant in getmemsize() by a symbol "kernphys" that is defined by the
linker script.

In collaboration with:	kib
2011-03-28 06:35:17 +00:00
Alan Cox
63740078e0 Amd64 doesn't have a lazypmap ipi. 2011-03-27 16:18:51 +00:00
Andriy Gapon
01a9e1a11b linux compat: add SO_PASSCRED option with basic handling
This seems to have been a part of a bigger patch by dchagin that either
haven't been committed or committed partially.

Submitted by:	dchagin, nox
MFC after:	2 weeks
2011-03-26 11:25:36 +00:00
Andriy Gapon
931f0826ea linux compat: add non-dummy capget and capset system calls, regenerate
And drop dummy definitions for those system calls.
This may transiently break the build.

PR:		kern/149168
Submitted by:	John Wehle <john@feith.com>
Reviewed by:	netchild
MFC after:	2 weeks
2011-03-26 10:59:24 +00:00
Andriy Gapon
1f4ec5a3ba linux compat: add non-dummy capget and capset system calls
PR:		kern/149168
Submitted by:	John Wehle <john@feith.com>
Reviewed by:	netchild
MFC after:	2 weeks
2011-03-26 10:51:56 +00:00
Dmitry Chagin
acface683e Export the correct AT_PLATFORM value.
Since signal trampolines are copied to the shared page do not need to
leave place on the stack for it. Forgotten in the previous commit.

MFC after:	1 Week
2011-03-26 09:25:35 +00:00
Alan Cox
1587dfd730 Move an external declaration to the appropriate header file. 2011-03-26 06:21:05 +00:00
Jung-uk Kim
cd45fec044 Improve CPU identifications of various IDT/Centaur/VIA, Rise and Transmeta
CPUs.  These CPUs need explicit MSR configuration to expose ceratin CPU
capabilities (e.g., CMPXCHG8B) to work around compatibility issues with
ancient software.  Unfortunately, Rise mP6 does not set the CX8 bit in CPUID
and there is no MSR to expose the feature although all mP6 processors are
capable of CMPXCHG8B according to datasheets I found from the Net.  Clean up
and simplify VIA PadLock detection while I am in the neighborhood.
2011-03-26 02:02:07 +00:00
Jeff Roberson
e4cd31dd3c - Merge changes to the base system to support OFED. These include
a wider arg2 for sysctl, updates to vlan code, IFT_INFINIBAND,
   and other miscellaneous small features.
2011-03-21 09:40:01 +00:00
Bjoern A. Zeeb
d2b74735b8 For now remove options FLOWTABLE from the remaining GENERIC kernel
configurations and make it opt-in for those who want it.  LINT will
still build it.

While it may be a perfect win in some scenarios, it still troubles users
(see PRs) in general cases.  In addition we are still allocating resources
even if disabled by sysctl and still leak arp/nd6 entries in case of
interface destruction.

Discussed with:	qingli (2010-11-24, just never executed)
Discussed with: juli (OCTEON1)
PR:		kern/148018, kern/155604, kern/144917, kern/146792
MFC after:	2 weeks
2011-03-19 15:50:34 +00:00
Jung-uk Kim
38b8542ca9 Deprecate tsc_present as the last of its real consumers finally disappeared. 2011-03-15 17:19:52 +00:00
David Christensen
dd46ab31de - Initial release of bxe(4) to support Broadcom NetXtreme II 10GbE.
(BCM57710, BCM57711, BCM57711E)

MFC after:	One month
2011-03-14 22:42:41 +00:00
Dmitry Chagin
8f1e49a638 Enable shared page use for amd64/linux32 and i386/linux binaries.
Move signal trampoline code from the top of the stack to the shared page.

MFC after:	2 Weeks
2011-03-13 14:58:02 +00:00
Andriy Gapon
d549ef5638 add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls
Regenerate system call and systrace support files.

PR:		kern/152822
Submitted by:	Artem Belevich <fbsdlist@src.cx>
Reviewed by:	jhb (earlier version)
MFC after:	3 weeks
2011-03-12 08:58:19 +00:00
Andriy Gapon
56ede1074e add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls
This commits makes necessary changes in syscall/sysent generation
infrastructure.

PR:		kern/152822
Submitted by:	Artem Belevich <fbsdlist@src.cx>
Reviewed by:	jhb (ealier version)
MFC after:	3 weeks
2011-03-12 08:51:43 +00:00
Andriy Gapon
136882cf92 amd64/NOTES: use a greater number in KSTACK_PAGES example
This is a minor cosmetic change - the users are more likely to want to
increase (rather than decrease) default kernel stack size,
which is already 4 pages on amd64.

MFC after:	4 days
2011-03-11 19:21:42 +00:00
Matthew D Fleming
c77715ef6c Mostly revert r219468, as I had misremembered the C standard regarding
the size of an extern array.

Keep one change from strncpy to strlcpy.
2011-03-11 18:56:55 +00:00
Jung-uk Kim
79422085d4 Add a tunable "machdep.disable_tsc" to turn off TSC. Specifically, it turns
off boot-time CPU frequency calibration, DELAY(9) with TSC, and using TSC as
a CPU ticker.  Note tsc_present does not change by this tunable.
2011-03-11 00:44:32 +00:00
Matthew D Fleming
cd67ac41ae Use MAXPATHLEN rather than the size of an extern array when copying the
kernel name.  Also consistenly use strlcpy().

Suggested by:	Warner Losh
2011-03-10 22:56:00 +00:00
Jung-uk Kim
bc34c87e81 Deprecate rarely used tsc_is_broken. Instead, we zero out tsc_freq because
it is almost always used with tsc_freq any way.
2011-03-10 20:02:58 +00:00