When the FreeBSD kernel is loaded from Xen the symtab and strtab are
not loaded the same way as the native boot loader. This patch adds
three new global variables to ddb that can be used to specify the
exact position and size of those tables, so they can be directly used
as parameters to db_add_symbol_table. A new helper is introduced, so callers
that used to set ksym_start and ksym_end can use this helper to set the new
variables.
It also adds support for loading them from the Xen PVH port, that was
previously missing those tables.
Sponsored by: Citrix Systems R&D
Reviewed by: kib
ddb/db_main.c:
- Add three new global variables: ksymtab, kstrtab, ksymtab_size that
can be used to specify the position and size of the symtab and
strtab.
- Use those new variables in db_init in order to call db_add_symbol_table.
- Move the logic in db_init to db_fetch_symtab in order to set ksymtab,
kstrtab, ksymtab_size from ksym_start and ksym_end.
ddb/ddb.h:
- Add prototype for db_fetch_ksymtab.
- Declate the extern variables ksymtab, kstrtab and ksymtab_size.
x86/xen/pv.c:
- Add support for finding the symtab and strtab when booted as a Xen
PVH guest. Since Xen loads the symtab and strtab as NetBSD expects
to find them we have to adapt and use the same method.
amd64/amd64/machdep.c:
arm/arm/machdep.c:
i386/i386/machdep.c:
mips/mips/machdep.c:
pc98/pc98/machdep.c:
powerpc/aim/machdep.c:
powerpc/booke/machdep.c:
sparc64/sparc64/machdep.c:
- Use the newly introduced db_fetch_ksymtab in order to set ksymtab,
kstrtab and ksymtab_size.
than u_char.
Migrate post_filter to use an int for a CPU rather than u_char.
Change intr_event_bind() to use an int for CPU rather than u_char.
It touches the ppc, sparc64, arm and mips machdep code but it should
(hah!) be a no-op.
Tested:
* i386, AMD64 laptops
Reviewed by: jhb
resume that is a superset of a pcb. Move the FPU state out of the pcb and
into this new structure. As part of this, move the FPU resume code on
amd64 into a C function. This allows resumectx() to still operate only on
a pcb and more closely mirrors the i386 code.
Reviewed by: kib (earlier version)
pmap, unlike i386, and similar to i386/xen pv, does not tolerate
abandoned mappings for the freed pages.
Reported and tested by: dumbbell
Diagnosed and reviewed by: alc
Sponsored by: The FreeBSD Foundation
mapping size (currently unused). The flags includes the fault access
bits, wired flag as PMAP_ENTER_WIRED, and a new flag
PMAP_ENTER_NOSLEEP to indicate that pmap should not sleep.
For powerpc aim both 32 and 64 bit, fix implementation to ensure that
the requested mapping is created when PMAP_ENTER_NOSLEEP is not
specified, in particular, wait for the available memory required to
proceed.
In collaboration with: alc
Tested by: nwhitehorn (ppc aim32 and booke)
Sponsored by: The FreeBSD Foundation and EMC / Isilon Storage Division
MFC after: 2 weeks
The MD allocators were very common, however there were some minor
differencies. These differencies were all consolidated in the MI allocator,
under ifdefs. The defines from machine/vmparam.h turn on features required
for a particular machine. For details look in the comment in sys/sf_buf.h.
As result no MD code left in sys/*/*/vm_machdep.c. Some arches still have
machine/sf_buf.h, which is usually quite small.
Tested by: glebius (i386), tuexen (arm32), kevlo (arm32)
Reviewed by: kib
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
We continue to use pmap_enter() for that. For unwiring virtual pages, we
now use pmap_unwire(), which unwires a range of virtual addresses instead
of a single virtual page.
Sponsored by: EMC / Isilon Storage Division
provides support for a variety of low-end graphics hardware (SBus adapters,
Mach64, QEMU's framebuffer, XVR-100). A driver for at least the Creator3D
cards will have to be present before this can become the default console
driver.
To test vt(4) on sparc64, set kern.vty=vt at the loader prompt.
This commit does not add error returns to minidumpsys() or
textdump_dumpsys(); those can also be added later.
Submitted by: Conrad Meyer (EMC / Isilon storage division)
several reasons for this change:
pmap_change_wiring() has never (in my memory) been used to set the wired
attribute on a virtual page. We have always used pmap_enter() to do that.
Moreover, it is not really safe to use pmap_change_wiring() to set the wired
attribute on a virtual page. The description of pmap_change_wiring() says
that it assumes the existence of a mapping in the pmap. However, non-wired
mappings may be reclaimed by the pmap at any time. (See pmap_collect().)
Many implementations of pmap_change_wiring() will crash if the mapping does
not exist.
pmap_unwire() accepts a range of virtual addresses, whereas
pmap_change_wiring() acts upon a single virtual page. Since we are
typically unwiring a range of virtual addresses, pmap_unwire() will be more
efficient. Moreover, pmap_unwire() allows us to unwire superpage mappings.
Previously, we were forced to demote the superpage mapping, because
pmap_change_wiring() only allowed us to express the unwiring of a single
base page mapping at a time. This added to the overhead of unwiring for
large ranges of addresses, including the implicit unwiring that occurs at
process termination.
Implementations for arm and powerpc will follow.
Discussed with: jeff, marcel
Reviewed by: kib
Sponsored by: EMC / Isilon Storage Division
These changes prevent sysctl(8) from returning proper output,
such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks
Sponsored by: Mellanox Technologies
This is derived from the mps(4) driver, but it supports only the 12Gb
IT and IR hardware including the SAS 3004, SAS 3008 and SAS 3108.
Some notes about this driver:
o The 12Gb hardware can do "FastPath" I/O, and that capability is included in
this driver.
o WarpDrive functionality has been removed, since it isn't supported in
the 12Gb driver interface.
o The Scatter/Gather list handling code is significantly different between
the 6Gb and 12Gb hardware. The 12Gb boards support IEEE Scatter/Gather
lists.
Thanks to LSI for developing and testing this driver for FreeBSD.
share/man/man4/mpr.4:
mpr(4) man page.
sys/dev/mpr/*:
mpr(4) driver files.
sys/modules/Makefile,
sys/modules/mpr/Makefile:
Add a module Makefile for the mpr(4) driver.
sys/conf/files:
Add the mpr(4) driver.
sys/amd64/conf/GENERIC,
sys/i386/conf/GENERIC,
sys/mips/conf/OCTEON1,
sys/sparc64/conf/GENERIC:
Add the mpr(4) driver to all config files that currently
have the mps(4) driver.
sys/ia64/conf/GENERIC:
Add the mps(4) and mpr(4) drivers to the ia64 GENERIC
config file.
sys/i386/conf/XEN:
Exclude the mpr module from building here.
Submitted by: Steve McConnell <Stephen.McConnell@lsi.com>
MFC after: 3 days
Tested by: Chris Reeves <chrisr@spectralogic.com>
Sponsored by: LSI, Spectra Logic
Relnotes: LSI 12Gb SAS driver mpr(4) added
the cpufreq code. Replace its use with smp_started. There's at least
one userland tool that still looks at the kern.smp.active sysctl, so
preserve it but point it to smp_started as well.
Discussed with: peter, jhb
MFC after: 3 days
Obtained from: Netflix
add it in kern.mk, but only if we're using clang. While this
option is supported by both clang and gcc, in the future there
may be changes to clang which change the defaults that require
a tweak to build our kernel such that other tools in our tree
will work. Set a good example by forcing -gdwarf-2 only for
clang builds, and only if the user hasn't specified another
dwarf level already. Update UPDATING to reflect the changed
state of affairs. This also keeps us from having to update
all the ARM kernels to add this, and also keeps us from
in the future having to update all the MIPS kernels and is
one less place the user will have to know to do something
special for clang and one less thing developers will need
to do when moving an architecture to clang.
Reviewed by: ian@
MFC after: 1 week
To reduce the diff struct pcu.cnt field was not renamed, so
PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
kvm(3) and vmstat(8). The goal was to not affect externally used KPI.
Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the
the global cnt variable.
Exp-run revealed no ports using it directly.
No objection from: arch@
Sponsored by: EMC / Isilon Storage Division
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.
MFC after: 3 weeks
casting to the appropriate type. (Note this fix cannot be done in
sys/sparc64/sparc64/spitfire.c, since that file is also included by
assembly source files.)
Reviewed by: marius
MFC after: 3 days
I/O windows, the default is to preserve the firmware-assigned resources.
PCI bus numbers are only managed if NEW_PCIB is enabled and the architecture
defines a PCI_RES_BUS resource type.
- Add a helper API to create top-level PCI bus resource managers for each
PCI domain/segment. Host-PCI bridge drivers use this API to allocate
bus numbers from their associated domain.
- Change the PCI bus and CardBus drivers to allocate a bus resource for
their bus number from the parent PCI bridge device.
- Change the PCI-PCI and PCI-CardBus bridge drivers to allocate the
full range of bus numbers from secbus to subbus from their parent bridge.
The drivers also always program their primary bus register. The bridge
drivers also support growing their bus range by extending the bus resource
and updating subbus to match the larger range.
- Add support for managing PCI bus resources to the Host-PCI bridge drivers
used for amd64 and i386 (acpi_pcib, mptable_pcib, legacy_pcib, and qpi_pcib).
- Define a PCI_RES_BUS resource type for amd64 and i386.
Reviewed by: imp
MFC after: 1 month
requires process descriptors to work and having PROCDESC in GENERIC
seems not enough, especially that we hope to have more and more consumers
in the base.
MFC after: 3 days
words, every architecture is now auto-sizing the kmem arena. This revision
changes kmeminit() so that the definition of VM_KMEM_SIZE_SCALE becomes
mandatory and the definition of VM_KMEM_SIZE becomes optional.
Replace or eliminate all existing definitions of VM_KMEM_SIZE. With
auto-sizing enabled, VM_KMEM_SIZE effectively became an alternate spelling
for VM_KMEM_SIZE_MIN on most architectures. Use VM_KMEM_SIZE_MIN for
clarity.
Change kmeminit() so that the effect of defining VM_KMEM_SIZE is similar to
that of setting the tunable vm.kmem_size. Whereas the macros
VM_KMEM_SIZE_{MAX,MIN,SCALE} have had the same effect as the tunables
vm.kmem_size_{max,min,scale}, the effects of VM_KMEM_SIZE and vm.kmem_size
have been distinct. In particular, whereas VM_KMEM_SIZE was overridden by
VM_KMEM_SIZE_{MAX,MIN,SCALE} and vm.kmem_size_{max,min,scale}, vm.kmem_size
was not. Remedy this inconsistency. Now, VM_KMEM_SIZE can be used to set
the size of the kmem arena at compile-time without that value being
overridden by auto-sizing.
Update the nearby comments to reflect the kmem submap being replaced by the
kmem arena. Stop duplicating the auto-sizing formula in every machine-
dependent vmparam.h and place it in kmeminit() where auto-sizing takes
place.
Reviewed by: kib (an earlier version)
Sponsored by: EMC / Isilon Storage Division
vm_pages. Provide trivial implementation which forwards the load to
_bus_dmamap_load_phys() page by page. Right now all architectures use
bus_dmamap_load_ma_triv().
Tested by: pho (as part of the functional patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
there as "kern.ipc.sendfile.readahead".
- Push all nsfbuf related tunables into MD code. Don't move them
to new namespace in favor of POLA.
Reviewed by: scottl
Approved by: re (gjb)
pmap_clear_reference() has had exactly one caller in the kernel for
several years, more precisely, since FreeBSD 8. Now, that call no
longer exists.
Approved by: re (kib)
Sponsored by: EMC / Isilon Storage Division
While here, correct all consumers to pass NULL instead of 0 as we pass
capability rights as pointers now, not uint64_t.
Reported by: Daniel Peyrolon
Tested by: Daniel Peyrolon
Approved by: re (marius)
an address in the first 2GB of the process's address space. This flag should
have the same semantics as the same flag on Linux.
To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address. While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.
Reviewed by: alc
Approved by: re (kib)
MADV_DONTNEED) and madvise(..., MADV_FREE). Specifically, introduce a new
pmap function, pmap_advise(), that operates on a range of virtual addresses
within the specified pmap, allowing for a more efficient implementation of
MADV_DONTNEED and MADV_FREE. Previously, the implementation of
MADV_DONTNEED and MADV_FREE relied on per-page pmap operations, such as
pmap_clear_reference(). Intuitively, the problem with this implementation
is that the pmap-level locks are acquired and released and the page table
traversed repeatedly, once for each resident page in the range
that was specified to madvise(2). A more subtle flaw with the previous
implementation is that pmap_clear_reference() would clear the reference bit
on all mappings to the specified page, not just the mapping in the range
specified to madvise(2).
Since our malloc(3) makes heavy use of madvise(2), this change can have a
measureable impact. For example, the system time for completing a parallel
"buildworld" on a 6-core amd64 machine was reduced by about 1.5% to 2.0%.
Note: This change only contains pmap_advise() implementations for a subset
of our supported architectures. I will commit implementations for the
remaining architectures after further testing. For now, a stub function is
sufficient because of the advisory nature of pmap_advise().
Discussed with: jeff, jhb, kib
Tested by: pho (i386), marcel (ia64)
Sponsored by: EMC / Isilon Storage Division