Commit Graph

6069 Commits

Author SHA1 Message Date
kib
a9d505a22a Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic
flags field. Updates to the atomic flags are performed using the atomic
ops on the containing word, do not require any vm lock to be held, and
are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9)
functions are provided to modify afalgs.

Document the changes to flags field to only require the page lock.

Introduce vm_page_reference(9) function to provide a stable KPI and
KBI for filesystems like tmpfs and zfs which need to mark a page as
referenced.

Reviewed by:    alc, attilio
Tested by:      marius, flo (sparc64); andreast (powerpc, powerpc64)
Approved by:	re (bz)
2011-09-06 10:30:11 +00:00
jhb
dd0f1f95e2 Enable the puc(4) driver on amd64 and i386 in GENERIC. This allows
devices supported by puc(4) to work "out of the box" since puc.ko does
not work "out of the box".

Reviewed by:	marcel
Approved by:	re (kib)
MFC after:	1 week
2011-08-26 21:22:34 +00:00
jhb
c189a8585e Make NKPT a kernel option on amd64 so that it can be set to a non-default
value from kernel config files.

Reviewed by:	alc
Approved by:	re (kib)
MFC after:	1 week
2011-08-26 17:08:22 +00:00
bz
422bc41b31 In HEAD when doing no further checkes there is no reason use the
temporary variable and check with if as TUNABLE_*_FETCH do not
alter values unless successfully found the tunable.

Reported by:	jhb, bde
MFC after:	3 days
X-MFC with:	r224516
Approved by:	re (kib)
2011-08-20 19:21:46 +00:00
rwatson
4af919b491 Second-to-last commit implementing Capsicum capabilities in the FreeBSD
kernel for FreeBSD 9.0:

Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *.  With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.

Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.

In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.

Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.

Approved by:	re (bz)
Submitted by:	jonathan
Sponsored by:	Google Inc
2011-08-11 12:30:23 +00:00
kib
f408aa11a3 - Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag
to VPO_UNMANAGED (and also making the flag protected by the vm object
  lock, instead of vm page queue lock).
- Mark the fake pages with both PG_FICTITIOUS (as it is now) and
  VPO_UNMANAGED. As a consequence, pmap code now can use use just
  VPO_UNMANAGED to decide whether the page is unmanaged.

Reviewed by:	alc
Tested by:	pho (x86, previous version), marius (sparc64),
    marcel (arm, ia64, powerpc), ray (mips)
Sponsored by:	The FreeBSD Foundation
Approved by:	re (bz)
2011-08-09 21:01:36 +00:00
rmacklem
3b8ed22952 Change all the sample kernel configurations to use
NFSCL, NFSD instead of NFSCLIENT, NFSSERVER since
NFSCL and NFSD are now the defaults. The client change is
needed for diskless configurations, so that the root
mount works for fstype nfs.
Reported by seanbru at yahoo-inc.com for i386/XEN.

Approved by:	re (hrs)
2011-08-07 20:16:46 +00:00
bz
632886e187 Introduce a tunable to disable the time consuming parts of bootup
memtesting, which can easily save seconds to minutes of boot time.
The tunable name is kept general to allow reusing the code in
alternate frameworks.

Requested by:	many
Discussed on:	arch (a while a go)
Obtained from:	Sandvine Incorporated
Reviewed by:	sbruno
Approved by:	re (kib)
MFC after:	2 weeks
2011-07-30 13:33:05 +00:00
attilio
30d87a57de Bump MAXCPU for amd64, ia64 and XLP mips appropriately.
From now on, default values for FreeBSD will be 64 maxiumum supported
CPUs on amd64 and ia64 and 128 for XLP. All the other architectures
seem already capped appropriately (with the exception of sparc64 which
needs further support on jalapeno flavour).

Bump __FreeBSD_version in order to reflect KBI/KPI brekage introduced
during the infrastructure cleanup for supporting MAXCPU > 32. This
covers cpumask_t retiral too.

The switch is considered completed at the present time, so for whatever
bug you may experience that is reconducible to that area, please report
immediately.

Requested by:	marcel, jchandra
Tested by:	pluknet, sbruno
Approved by:	re (kib)
2011-07-19 13:00:30 +00:00
attilio
a73e834ebb Add the possibility to specify from kernel configs MAXCPU value.
This patch is going to help in cases like mips flavours where you
want a more granular support on MAXCPU.

No MFC is previewed for this patch.

Tested by:	pluknet
Approved by:	re (kib)
2011-07-19 00:37:24 +00:00
attilio
9a6ff5ad37 - Remove the eintrcnt/eintrnames usage and introduce the concept of
sintrcnt/sintrnames which are symbols containing the size of the 2
  tables.
- For amd64/i386 remove the storage of intr* stuff from assembly files.
  This area can be widely improved by applying the same to other
  architectures and likely finding an unified approach among them and
  move the whole code to be MI. More work in this area is expected to
  happen fairly soon.

No MFC is previewed for this patch.

Tested by:	pluknet
Reviewed by:	jhb
Approved by:	re (kib)
2011-07-18 15:19:40 +00:00
jkim
52539f62b4 Correct cpu_monitor() and cpu_mwait() for amd64. These instructions take
%rcx as "extensions" in long mode.  If any unused bit is set in %rcx, these
instructions cause general protection fault.  Fix style nits and synchronize
i386 with amd64.
2011-07-05 18:42:10 +00:00
attilio
364d0522f7 With retirement of cpumask_t and usage of cpuset_t for representing a
mask of CPUs, pc_other_cpus and pc_cpumask become highly inefficient.

Remove them and replace their usage with custom pc_cpuid magic (as,
atm, pc_cpumask can be easilly represented by (1 << pc_cpuid) and
pc_other_cpus by (all_cpus & ~(1 << pc_cpuid))).

This change is not targeted for MFC because of struct pcpu members
removal and dependency by cpumask_t retirement.

MD review by:	marcel, marius, alc
Tested by:	pluknet
MD testing by:	marcel, marius, gonzo, andreast
2011-07-04 12:04:52 +00:00
alc
9cb2f04853 When iterating over a paging queue, explicitly check for PG_MARKER, instead
of relying on zeroed memory being interpreted as an empty PV list.

Reviewed by:	kib
2011-07-02 23:42:04 +00:00
jonathan
8c932faae4 Add some checks to ensure that Capsicum is behaving correctly, and add some
more explicit comments about what's going on and what future maintainers
need to do when e.g. adding a new operation to a sys_machdep.c.

Approved by: mentor(rwatson), re(bz)
2011-06-30 10:56:02 +00:00
alc
21902be08c Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this
option to vm_object_page_remove() asserts that the specified range of pages
is not mapped, or more precisely that none of these pages have any managed
mappings.  Thus, vm_object_page_remove() need not call pmap_remove_all() on
the pages.

This change not only saves time by eliminating pointless calls to
pmap_remove_all(), but it also eliminates an inconsistency in the use of
pmap_remove_all() versus related functions, like pmap_remove_write().  It
eliminates harmless but pointless calls to pmap_remove_all() that were being
performed on PG_UNMANAGED pages.

Update all of the existing assertions on pmap_remove_all() to reflect this
change.

Reviewed by:	kib
2011-06-29 16:40:41 +00:00
jonathan
624e733467 We may split today's CAPABILITIES into CAPABILITY_MODE (which has
to do with global namespaces) and CAPABILITIES (which has to do with
constraining file descriptors). Just in case, and because it's a better
name anyway, let's move CAPABILITIES out of the way.

Also, change opt_capabilities.h to opt_capsicum.h; for now, this will
only hold CAPABILITY_MODE, but it will probably also hold the new
CAPABILITIES (implying constrained file descriptors) in the future.

Approved by: rwatson
Sponsored by: Google UK Ltd
2011-06-29 13:03:05 +00:00
jhb
83fca1d193 Move {amd64,i386}/pci/pci_bus.c and {amd64,i386}/include/pci_cfgreg.h to
the x86 tree.  The $PIR code is still only enabled on i386 and not amd64.
While here, make the qpi(4) driver on conditional on 'device pci'.
2011-06-22 21:04:13 +00:00
jhb
3fa22c485f Oops, missed these in 223424.
Reported by:	jkim
2011-06-22 18:48:07 +00:00
jhb
b2d6c3b58f Use uintXX_t instead of u_intXX_t. 2011-06-22 17:55:16 +00:00
jhb
a0627f2e3f Add a helper routine to conditionally modify the start address of a
resource allocation from an x86 Host-PCI bridge driver so that it can be
reused by the ACPI Host-PCI bridge driver (and eventually the MPTable
Host-PCI bridge driver) instead of duplicating the same logic.  Note that
this means that hw.acpi.host_mem_start is now replaced with the
hw.pci.host_mem_start tunable that was already used in the non-ACPI case.
This also removes hw.acpi.host_mem_start on ia64 where it was not
applicable (the implementation was very x86-specific).

While here, adjust the logic to apply the new start address on any
"wildcard" allocation even if that allocation comes from a subset of
the allowable address range.

Reviewed by:	imp (1)
2011-06-22 16:15:15 +00:00
kib
abf14b630a Fix vfork. Add comments. 2011-06-18 12:13:28 +00:00
hselasky
88ca90f66a Enable USB 3.0 support by default in i386 and amd64 GENERIC kernels.
Discussed with:	joel @ and thompsa @
MFC after:	7 days
2011-06-14 20:30:49 +00:00
joel
6b76ad6c12 Enable sound support by default on i386 and amd64.
The generic sound driver has been added, along with enough
device-specific drivers to support the most common audio
chipsets.

We've discussed enabling it from time to time over the years
and we've received numerous requests from users, so we decided
that shipping 9.0 with working audio by default would be the
best thing to do.

Bug reports should be sent to the multimedia@ mailing list, as
usual.

Approved by:    mav
No objection:   re
2011-06-11 09:08:46 +00:00
jhb
10d756faa8 Implement BUS_ADJUST_RESOURCE() for the x86 drivers that sit between the
Host-PCI bridge drivers and nexus.
2011-06-10 12:30:16 +00:00
avg
74204e61b2 remove code for dynamic offlining/onlining of CPUs on x86
The code has definitely been broken for SCHED_ULE, which is a default
scheduler.  It may have been broken for SCHED_4BSD in more subtle ways,
e.g. with manually configured CPU affinities and for interrupt devilery
purposes.
We still provide a way to disable individual CPUs or all hyperthreading
"twin" CPUs before SMP startup.  See the UPDATING entry for details.

Interaction between building CPU topology and disabling CPUs still
remains fuzzy: topology is first built using all availble CPUs and then
the disabled CPUs should be "subtracted" from it.  That doesn't work
well if the resulting topology becomes non-uniform.

This work is done in cooperation with Attilio Rao who in addition to
reviewing also provided parts of code.

PR:		kern/145385
Discussed with:	gcooper, ambrisko, mdf, sbruno
Reviewed by:	attilio
Tested by:	pho, pluknet
X-MFC after:	never
2011-06-08 08:12:15 +00:00
attilio
31094255fe Bring back the number of CPU to 32. 2011-06-07 08:05:23 +00:00
attilio
fcefe479fe MFC 2011-06-06 21:38:39 +00:00
avg
9162612be7 don't use cpuid level 4 in x86 cpu topology detection if it's not supported
This regression was introduced in r213323.
There are probably no Intel cpus that support amd64 mode, but do not
support cpuid level 4, but it's better to keep i386 and amd64 versions
of this code in sync.

Discovered by:	pho
Tested by:	pho
MFC after:	2 weeks
2011-06-06 14:23:13 +00:00
kevlo
22d66d32a1 Bring back r222275. runfw(4) will statically link in rt2870.fw.uu
to the kernel, though I have MODULES_OVERRIDE="" in GENERIC.

Spotted by:	thompsa
2011-05-25 10:04:13 +00:00
kevlo
d0de3ee1d9 run(4) needs firmware loaded to work 2011-05-25 04:46:48 +00:00
attilio
2bb57ae016 Revert a patch that involountary sneaked in while I was MFCing. 2011-05-23 23:51:01 +00:00
attilio
6d7371f950 MFC 2011-05-23 01:17:30 +00:00
attilio
6a2b7fdc52 MFC 2011-05-18 16:01:29 +00:00
jkim
26831ce98b Update CPUID bits to reflect AMD Bulldozer and Intel Sandy Bridge features.
Note AMD dropped SSE5 extensions in order to avoid ISA overlap with Intel
AVX instructions.  The SSE5 bit was recycled as XOP extended instruction
bit, CVT16 was deprecated in favor of F16C (half-precision float conversion
instructions for AVX), and the remaining FMA4 (4-operand FMA instructions)
gained a separate CPUID bit.  Replace non-existent references with today's
CPUID specifications.
2011-05-17 22:36:16 +00:00
attilio
9ff3491e67 MFC 2011-05-13 20:58:48 +00:00
mdf
3d3b036f95 Move the ZERO_REGION_SIZE to a machine-dependent file, as on many
architectures (i386, for example) the virtual memory space may be
constrained enough that 2MB is a large chunk.  Use 64K for arches
other than amd64 and ia64, with special handling for sparc64 due to
differing hardware.

Also commit the comment changes to kmem_init_zero_region() that I
missed due to not saving the file.  (Darn the unfamiliar development
environment).

Arch maintainers, please feel free to adjust ZERO_REGION_SIZE as you
see fit.

Requested by:	alc
MFC after:	1 week
MFC with:	r221853
2011-05-13 19:35:01 +00:00
attilio
99e65551b9 MFC 2011-05-12 14:01:40 +00:00
dchagin
e506954648 Remove wrong comment.
MFC after:	1 week.
2011-05-11 17:57:15 +00:00
jkim
a112b2dac2 Add SC_PIXEL_MODE to GENERIC for amd64 and i386.
Requested by:	many
2011-05-10 16:44:16 +00:00
attilio
d7cb9e4814 MFC 2011-05-09 18:53:13 +00:00
jkim
6d3172737b Implement boot-time TSC synchronization test for SMP. This test is executed
when the user has indicated that the system has synchronized TSCs or it has
P-state invariant TSCs.  For the former case, we may clear the tunable if it
fails the test to prevent accidental foot-shooting.  For the latter case, we
may set it if it passes the test to notify the user that it may be usable.
2011-05-09 17:34:00 +00:00
attilio
a0b51ba62f MFC 2011-05-06 22:45:33 +00:00
avg
777b49b2a5 prepare code that does topology detection for amd cpus for bulldozer
This also introduces a new detection path for family 10h and newer
pre-bulldozer cpus, pre-10h hardware should not be affected.

Tested by:	Gary Jennejohn <gljennjohn@googlemail.com>
		(with pre-10h hardware)
MFC after:	2 weeks
2011-05-06 13:51:54 +00:00
attilio
fe4de567b5 Commit the support for removing cpumask_t and replacing it directly with
cpuset_t objects.
That is going to offer the underlying support for a simple bump of
MAXCPU and then support for number of cpus > 32 (as it is today).

Right now, cpumask_t is an int, 32 bits on all our supported architecture.
cpumask_t on the other side is implemented as an array of longs, and
easilly extendible by definition.

The architectures touched by this commit are the following:
- amd64
- i386
- pc98
- arm
- ia64
- XEN

while the others are still missing.
Userland is believed to be fully converted with the changes contained
here.

Some technical notes:
- This commit may be considered an ABI nop for all the architectures
  different from amd64 and ia64 (and sparc64 in the future)
- per-cpu members, which are now converted to cpuset_t, needs to be
  accessed avoiding migration, because the size of cpuset_t should be
  considered unknown
- size of cpuset_t objects is different from kernel and userland (this is
  primirally done in order to leave some more space in userland to cope
  with KBI extensions). If you need to access kernel cpuset_t from the
  userland please refer to example in this patch on how to do that
  correctly (kgdb may be a good source, for example).
- Support for other architectures is going to be added soon
- Only MAXCPU for amd64 is bumped now

The patch has been tested by sbruno and Nicholas Esborn on opteron
4 x 12 pack CPUs. More testing on big SMP is expected to came soon.
pluknet tested the patch with his 8-ways on both amd64 and i386.

Tested by:	pluknet, sbruno, gianni, Nicholas Esborn
Reviewed by:	jeff, jhb, sbruno
2011-05-05 14:39:14 +00:00
attilio
f756d5bed6 Revert md_assert_preempt() introduction.
Discussed with:	jeff, jhb
2011-05-04 20:29:40 +00:00
attilio
b29cc3952a MFC 2011-05-03 18:57:46 +00:00
jhb
f4c1badc8d Enable the new PCI-PCI bridge driver on amd64 and i386 by default. It can
be disabled via 'nooptions NEW_PCIB'.
2011-05-03 18:23:11 +00:00
jhb
51bd96b572 Reimplement how PCI-PCI bridges manage their I/O windows. Previously the
driver would verify that requests for child devices were confined to any
existing I/O windows, but the driver relied on the firmware to initialize
the windows and would never grow the windows for new requests.  Now the
driver actively manages the I/O windows.

This is implemented by allocating a bus resource for each I/O window from
the parent PCI bus and suballocating that resource to child devices.  The
suballocations are managed by creating an rman for each I/O window.  The
suballocated resources are mapped by passing the bus_activate_resource()
call up to the parent PCI bus.  Windows are grown when needed by using
bus_adjust_resource() to adjust the resource allocated from the parent PCI
bus.  If the adjust request succeeds, the window is adjusted and the
suballocation request for the child device is retried.

When growing a window, the rman_first_free_region() and
rman_last_free_region() routines are used to determine if the front or
end of the existing I/O window is free.  From using that, the smallest
ranges that need to be added to either the front or back of the window
are computed.  The driver will first try to grow the window in whichever
direction requires the smallest growth first followed by the other
direction if that fails.

Subtractive bridges will first attempt to satisfy requests for child
resources from I/O windows (including attempts to grow the windows).  If
that fails, the request is passed up to the parent PCI bus directly
however.

The PCI-PCI bridge driver will try to use firmware-assigned ranges for
child BARs first and only allocate a "fresh" range if that specific range
cannot be accommodated in the I/O window.  This allows systems where the
firmware assigns resources during boot but later wipes the I/O windows
(some ACPI BIOSen are known to do this) to "rediscover" the original I/O
window ranges.

The ACPI Host-PCI bridge driver has been adjusted to correctly honor
hw.acpi.host_mem_start and the I/O port equivalent when a PCI-PCI bridge
makes a wildcard request for an I/O window range.

The new PCI-PCI bridge driver is only enabled if the NEW_PCIB kernel option
is enabled.  This is a transition aide to allow platforms that do not
yet support bus_activate_resource() and bus_adjust_resource() in their
Host-PCI bridge drivers (and possibly other drivers as needed) to use the
old driver for now.  Once all platforms support the new driver, the
kernel option and old driver will be removed.

PR:		kern/143874 kern/149306
Tested by:	mav
2011-05-03 17:37:24 +00:00
attilio
fd4965df40 MFC @ r221324 2011-05-02 14:23:36 +00:00