Replace the per-object resident and cached pages splay tree with a
path-compressed multi-digit radix trie.
Along with this, switch also the x86-specific handling of idle page
tables to using the radix trie.
This change is supposed to do the following:
- Allowing the acquisition of read locking for lookup operations of the
resident/cached pages collections as the per-vm_page_t splay iterators
are now removed.
- Increase the scalability of the operations on the page collections.
The radix trie does rely on the consumers locking to ensure atomicity of
its operations. In order to avoid deadlocks the bisection nodes are
pre-allocated in the UMA zone. This can be done safely because the
algorithm needs at maximum one new node per insert which means the
maximum number of the desired nodes is the number of available physical
frames themselves. However, not all the times a new bisection node is
really needed.
The radix trie implements path-compression because UFS indirect blocks
can lead to several objects with a very sparse trie, increasing the number
of levels to usually scan. It also helps in the nodes pre-fetching by
introducing the single node per-insert property.
This code is not generalized (yet) because of the possible loss of
performance by having much of the sizes in play configurable.
However, efforts to make this code more general and then reusable in
further different consumers might be really done.
The only KPI change is the removal of the function vm_page_splay() which
is now reaped.
The only KBI change, instead, is the removal of the left/right iterators
from struct vm_page, which are now reaped.
Further technical notes broken into mealpieces can be retrieved from the
svn branch:
http://svn.freebsd.org/base/user/attilio/vmcontention/
Sponsored by: EMC / Isilon storage division
In collaboration with: alc, jeff
Tested by: flo, pho, jhb, davide
Tested by: ian (arm)
Tested by: andreast (powerpc)
- We need to add "-mllvm -arm-enable-ehabi" to clangs CFLAGS when
generating the unwind tables to tell it to add the required directives to
the assembly it generates.
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.
The change is mostly mechanical but few notes are reported:
* The KPI changes as follow:
- VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
- VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
- VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
- VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
(in order to avoid visibility of implementation details)
- The read-mode operations are added:
VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
sys/mutex.h in consumers directly to cater its inlining functions
using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
the compat layer because the name clash between FreeBSD and solaris
versions must be avoided.
At this purpose zfs redefines the vm_object locking functions
directly, isolating the FreeBSD components in specific compat stubs.
The KPI results heavilly broken by this commit. Thirdy part ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho
will prevent the kernel from linking if the device driver are included
without the virtio module. Remove pci and scbus for the same reason.
Also explain the relationship and necessity of the virtio and virtio_pci
modules. Currently in FreeBSD, we only support VirtIO PCI, but it could
be replaced with a different interface (like MMIO) and the device
(network, block, etc) will still function.
Requested by: luigi
Approved by: grehan (mentor)
MFC after: 3 days
tunable by default.
This will allow GENERIC configurations to boot on small memory boxes, but
not require end users who want to use CTL to recompile their kernel. They
can simply set kern.cam.ctl.disable=0 in loader.conf.
The eventual solution to the memory usage problem is to change the way
CTL allocates memory to be more configurable, but this should fix things
for small memory situations in the mean time.
UPDATING: Explain the change in the CTL configuration, and
how users can enable CTL if they would like to use
it.
sys/conf/options: Add a new option, CTL_DISABLE, that prevents CTL
from initializing.
ctl.c: If CTL_DISABLE is turned on, don't initialize.
i386/conf/GENERIC,
amd64/conf/GENERIC: Re-enable device ctl, and add the CTL_DISABLE
option.
precise time event generation. This greatly improves granularity of
callouts which are not anymore constrained to wait next tick to be
scheduled.
- Extend the callout KPI introducing a set of callout_reset_sbt* functions,
which take a sbintime_t as timeout argument. The new KPI also offers a
way for consumers to specify precision tolerance they allow, so that
callout can coalesce events and reduce number of interrupts as well as
potentially avoid scheduling a SWI thread.
- Introduce support for dispatching callouts directly from hardware
interrupt context, specifying an additional flag. This feature should be
used carefully, as long as interrupt context has some limitations
(e.g. no sleeping locks can be held).
- Enhance mechanisms to gather informations about callwheel, introducing
a new sysctl to obtain stats.
This change breaks the KBI. struct callout fields has been changed, in
particular 'int ticks' (4 bytes) has been replaced with 'sbintime_t'
(8 bytes) and another 'sbintime_t' field was added for precision.
Together with: mav
Reviewed by: attilio, bde, luigi, phk
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo (amd64, sparc64), marius (sparc64), ian (arm),
markj (amd64), mav, Fabian Keil
fail interrupt handler, there seems to be either a broken batch of them
or a tendency to develop a defect which causes this interrupt to fire
inadvertedly. Given that apart from this problem these machines work
just fine, add a tunable allowing the setup of the power fail interrupt
to be disabled.
While at it, remove the DEBUGGER_ON_POWERFAIL compile time option and
make that behavior also selectable via the newly added tunable.
- Apparently, it's no longer a problem to call shutdown_nice(9) from within
an interrupt filter (some other drivers in the tree do the same). So
change the power fail interrupt from an handler in order to simplify the
code and get rid of a !INTR_MPSAFE handler.
- Use NULL instead of 0 for pointers.
MFC after: 1 week
- Add support for IPv6 rx csum offload
- Finally switch mxge from using its own driver lro, to
using tcp_lro
MFC after: 7 days
Sponsored by: Myricom Inc.
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
* Illumos zfs issue #3035 [1] LZ4 compression support in ZFS.
LZ4 is a new high-speed BSD-licensed compression algorithm created
by Yann Collet that delivers very high compression and decompression
performance compared to lzjb (>50% faster on compression, >80% faster
on decompression and around 3x faster on compression of incompressible
data), while giving better compression ratio [1].
This version of LZ4 corresponds to upstream's [2] revision 85.
Please note that for obvious reasons this is not backward read
compatible. This means once a pool have LZ4 compressed data, these
data can no longer be read by older ZFS implementations.
Local changes:
- On-stack hash table disabled and using kernel slab allocator
instead, at this time. This requires larger kernel thread stack
for zio workers. This may change in the future should we adjusted
the zio workers' thread stack size.
- likely and unlikely will be undefined if they are already defined,
this is required for i386 XEN build.
- Removed De Bruijn sequence based __builtin_ctz family of builtins
in favor of the latter. Both GCC and clang supports these builtins.
- Changed the way the LZ4 code detects endianness.
- Manual pages modifications to mention the feature based on Illumos
counterpart.
- Boot loader changes to make it support LZ4 decompression.
[1] https://www.illumos.org/issues/3035
[2] http://code.google.com/p/lz4/source/list
Obtained from: Illumos (13921:9d721847e469)
Tested on: FreeBSD/amd64
MFC after: 1 month
Only during very early boot, before malloc(9) is functional (SI_SUB_KMEM),
the static ktr_buf_init is used. Size of the static buffer is determined
by a new kernel option KTR_BOOT_ENTRIES. Its default value is 1024.
This commit builds on top of r243046.
Reviewed by: alc
MFC after: 17 days
And provide kernel compiler version as a sysctl as well.
This is useful while we have gcc and clang cohabitation.
This could be even more useful when we have support
for external toolchains.
In cooperation with: mjg
MFC after: 13 days
generating binary diffs.
- Constify a few strings used in the driver.
- Style changes to make the driver compile with default clang settings.
Approved by: HighPoint Technologies
MFC after: 3 days
'bhyve' was developed by grehan@ and myself at NetApp (thanks!).
Special thanks to Peter Snyder, Joe Caradonna and Michael Dexter for their
support and encouragement.
Obtained from: NetApp
Implement an FDT attachment for altera_avgen(4).
Portions of the changeset updating DTS and device.hints will be merged
separately.
Sponsored by: DARPA, AFRL
Rework altera_avgen(4) to cleanly(ish) separate nexus bus
attachment from the driver itself. This should allow us to
plug in an fdt attachment more easily.
Sponsored by: DARPA, AFRL
Add an Intel StrataFlash (isf) driver FDT attachment.
Portions of the original changeset hooking up FDT use for BERI will be
merged separately.
Sponsored by: DARPA, AFRL
Provided a bus_space implementation for FDT, modelled on
bus_space_generic, but with a local version of the map address
routine that does a P->V translation, as is the case with NLM's
similar routine for XLP. It's not clear to me that this is the
right solution -- possibly this belongs in simplebus -- however,
it is sufficient to get the DE4 LED driver working.
Sponsored by: DARPA, AFRL
implement the BSM audit trail format. Rename the kernel versions of the
files to match the userspace filenames so that it's easier to work out
what they correspond to, and therefore ensure they are kept in-sync.
Obtained from: TrustedBSD Project
but LDFLAGS is not (yet) passed on to the linker (via SYSTEM_LD et al).
Do so now. As such, any kernel configuration can now define linker
flags by setting LDFLAGS as normal and not have to revert to hacks
like setting DEBUG for flags that do not relate to debugging (see
sys/powerpc/conf/MPC85XX).
reducing the number of runtime checks done by the SDK code.
o) Group board/CPU information at early startup by subject matter, so that e.g.
CPU information is adjacent to CPU information and board information is
adjacent to board information.
- In sys/ofed/drivers/infiniband/core/cma.c, an enum struct member is
interpreted as an int, so cast it to an int.
- In sys/ofed/drivers/infiniband/core/ud_header.c, initialize the
packet_length variable in ib_ud_header_init(), to prevent undefined
behaviour.
- In sys/ofed/drivers/infiniband/ulp/sdp/sdp_rx.c, call rdma_notify()
with the correct enum type and value.
- In sys/ofed/include/linux/pci.h, change the PCI_DEVICE and PCI_VDEVICE
macros to use C99 struct initializers, so additional members can be
overridden.
Reviewed by: delphij, Garrett Cooper <yanegomi@gmail.com>
MFC after: 1 week
use the -mprofiler-epilogue option if the compiler is clang, as the flag
is not supported. While here, fix up the value indentations.
MFC after: 1 week
Allow textdumps to be called explicitly from DDB.
If "dump" is called in DDB and textdumps are enabled then abort the
dump and tell the user to turn off textdumps.
Add options TEXTDUMP_PREFERRED to turn textdumps on by default.
Add options TEXTDUMP_VERBOSE to be a bit more verbose while textdumping.
Reviewed by: rwatson
MFC after: 2 weeks
files. It used to be in files.mips before the clean-room rewrite and
really doesn't belong there. If we need to grow arch specific code,
we can move it into $ARCH/$ACH/siba_machdep.c.
advantages. First, PV entries are roughly half the size. Second, this
allocator doesn't access the paging queues, and thus it allows for the
removal of the page queues lock from this pmap.
Replace all uses of the page queues lock by a R/W lock that is private
to this pmap.
Tested by: marcel
on the related functionality in the runtime via the sysctl variable
net.pfil.forward. It is turned off by default.
Sponsored by: Yandex LLC
Discussed with: net@
MFC after: 2 weeks
more appropriate named kernel options for the very distinct
send and receive path.
"options SOCKET_SEND_COW" enables VM page copy-on-write based
sending of data on an outbound socket.
NB: The COW based send mechanism is not safe and may result
in kernel crashes.
"options SOCKET_RECV_PFLIP" enables VM kernel/userspace page
flipping for special disposable pages attached as external
storage to mbufs.
Only the naming of the kernel options is changed and their
corresponding #ifdef sections are adjusted. No functionality
is added or removed.
Discussed with: alc (mechanism and limitations of send side COW)
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
GIANT from VFS. In addition, disconnect also netsmb, which is a base
requirement for SMBFS.
In the while SMBFS regular users can use FUSE interface and smbnetfs
port to work with their SMBFS partitions.
Also, there are ongoing efforts by vendor to support in-kernel smbfs,
so there are good chances that it will get relinked once properly locked.
This is not targeted for MFC.
GIANT from VFS. This code is particulary broken and fragile and other
in-kernel implementations around, found in other operating systems,
don't really seem clean and solid enough to be imported at all.
If someone wants to reconsider in-kernel NTFS implementation for
inclusion again, a fair effort for completely fixing and cleaning it
up is expected.
In the while NTFS regular users can use FUSE interface and ntfs-3g
port to work with their NTFS partitions.
This is not targeted for MFC.
GIANT from VFS. In addition, disconnect also netncp, which is a base
requirement for NWFS.
In the possibility of a future maintenance of the code and later
readd to the FreeBSD base, maybe we should think about a better location
for netncp. I'm not entirely sure the / top location is actually right,
however I will let network people to comment on that more specifically.
This is not targeted for MFC.
sdchi encapsulates a generic SD Host Controller logic that relies on
actual hardware driver for register access.
sdhci_pci implements driver for PCI SDHC controllers using new SDHCI
interface
No kernel config modifications are required, but if you load sdhc
as a module you must switch to sdhci_pci instead.
This has been developed during 2 summer of code mandates and being revived
by gnn recently.
The functionality in this commit mirrors entirely content of fusefs-kmod
port, which doesn't need to be installed anymore for -CURRENT setups.
In order to get some sparse technical notes, please refer to:
http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html
or to the project branch:
svn://svn.freebsd.org/base/projects/fuse/
which also contains granular history of changes happened during port
refinements. This commit does not came from the branch reintegration
itself because it seems svn is not behaving properly for this functionaly
at the moment.
Partly Sponsored by: Google, Summer of Code program 2005, 2011
Originally submitted by: ilya, Csaba Henk <csaba-ml AT creo DOT hu >
In collabouration with: pho
Tested by: flo, gnn, Gustau Perez,
Kevin Oberman <rkoberman AT gmail DOT com>
MFC after: 2 months
This is important to secure a small timeframe at boot time, when
network is already configured, but pf(4) is not yet.
PR: kern/171622
Submitted by: Olivier Cochard-LabbИ <olivier cochard.me>
reside, and move there ipfw(4) and pf(4).
o Move most modified parts of pf out of contrib.
Actual movements:
sys/contrib/pf/net/*.c -> sys/netpfil/pf/
sys/contrib/pf/net/*.h -> sys/net/
contrib/pf/pfctl/*.c -> sbin/pfctl
contrib/pf/pfctl/*.h -> sbin/pfctl
contrib/pf/pfctl/pfctl.8 -> sbin/pfctl
contrib/pf/pfctl/*.4 -> share/man/man4
contrib/pf/pfctl/*.5 -> share/man/man5
sys/netinet/ipfw -> sys/netpfil/ipfw
The arguable movement is pf/net/*.h -> sys/net. There are
future plans to refactor pf includes, so I decided not to
break things twice.
Not modified bits of pf left in contrib: authpf, ftp-proxy,
tftp-proxy, pflogd.
The ipfw(4) movement is planned to be merged to stable/9,
to make head and stable match.
Discussed with: bz, luigi
type of compiler is being used (currently clang or gcc). COMPILER_TYPE
is set in the new bsd.compiler.mk file based on the value of the CC
variable or, should it prove informative, by running ${CC} --version
and examining the output.
To avoid negative performance impacts in the default case and correct
value for COMPILER_TYPE type is determined and passed in the environment
of submake instances while building world.
Replace adhoc attempts at determining the compiler type by examining
CC or MK_CLANG_IS_CC with checks of COMPILER_TYPE. This eliminates
bootstrapping complications when first setting WITH_CLANG_IS_CC.
Sponsored by: DARPA, AFRL
Reviewed by: Yamaya Takashi <yamayan@kbh.biglobe.ne.jp>, imp, linimon
(with some modifications post review)
MFC after: 2 weeks
with no output. Add "echo" at the end these shell commands whose output is
assigned to a variable's value to ensure there is some output.
Submitted by: John Van Horne <jvanhorne@juniper.net>
drivers:
- Remove scsi_low_pisa.*, they were unused.
- Remove <compat/netbsd/physio_proc.h> and calls to the stubs in that
header. They were empty nops.
- Retire sl_xname and use device_get_nameunit() and device_printf() with
the underlying device_t instead.
- Remove unused {ct,ncv,nsp,stg}print() functions.
- Remove empty SOFT_INTR_REQUIRED() macro and the unused sl_irq member.
generator, found on IvyBridge and supposedly later CPUs, accessible
with RDRAND instruction.
From the Intel whitepapers and articles about Bull Mountain, it seems
that we do not need to perform post-processing of RDRAND results, like
AES-encryption of the data with random IV and keys, which was done for
Padlock. Intel claims that sanitization is performed in hardware.
Make both Padlock and Bull Mountain random generators support code
covered by kernel config options, for the benefit of people who prefer
minimal kernels. Also add the tunables to disable hardware generator
even if detected.
Reviewed by: markm, secteam (simon)
Tested by: bapt, Michael Moll <kvedulv@kvedulv.de>
MFC after: 3 weeks
- Move mwlfw from {amd64,i386}/conf/NOTES to sys/conf/NOTES (mwl(4) is
already present in sys/conf/NOTES).
- Remove duplicate mwl(4) entries from {amd64,i386}/conf/NOTES.
- While here, add a description to the sfxge line in amd64/conf/NOTES.
- Provide missing function that can do hashing of arbitrary sized buffer.
- Refetch lookup3.c and do only minimal edits to it, so that diff between
our jenkins_hash.c and lookup3.c is minimal.
- Add declarations for jenkins_hash(), jenkins_hash32() to sys/hash.h.
- Document these functions in hash(9)
Obtained from: http://burtleburtle.net/bob/c/lookup3.c
warnings in sys/gnu/fs/xfs. The only warnings that still need to be
suppressed are those about array bound overruns of flexible array
members in xfs_dir2_{block,sf}.c, which are too expensive (in terms of
cascading code changes) to fix.
MFC after: 1 week
X-MFC-With: r239959
linking it statically into the kernel. With our gcc in base there are
no warnings, so also remove the WERROR= from the module makefile.
Noted by: Eir Nym <eirnym@gmail.com>
MFC after: 1 week
firmware objects by adding --no-warn-mismatch to the linker flags,
add --no-warn-mismatch when linking firmware objects (*.fwo) as
well as to the link of the main kernel file. This permits firmware
modules to be statically linked into an ia64 kernel.
uudecode, and NORMAL_FWO to use ld to build the .fwo file) and use those
instead of explicit ld/uudecode invocations in sys/conf/files. Apart from
increasing readability, this makes it possible to adjust the flags used for
firmware objects in one place.
MFC after: 2 weeks
handler and not more statically.
Unfortunately, it seems that this is not ideal for new platform bringup
and boot low level development (which needs ktr_cpumask to be effective
before tunables can be setup).
Because of this, add a way to statically initialize cpusets, by passing
an list of initializers, divided by commas. Also, provide a way to enforce
an all-set mask, for above mentioned initializers.
This imposes some differences on how KTR_CPUMASK is setup now as a
kernel option, and in particular this makes the words specifications
backward wrt. what is currently in -CURRENT. In order to avoid mismatches
between KTR_CPUMASK definition and other way to setup the mask
(tunable, sysctl) and to print it, change the ordering how
cpusetobj_print() and cpusetobj_scan() acquire the words belonging
to the set.
Please give a look to sys/conf/NOTES in order to understand how the
new format is supposed to work.
Also, ktr manpages will be updated shortly by gjb which volountereed
for this.
This patch won't be merged because it changes a POLA (at least
from the theoretical standpoint) and this is however a patch that
proves to be effective only in development environments.
Requested by: rpaulo
Reviewed by: jeff, rpaulo
The driver attempts to support all documented parts, but has only been
tested with the 512Mbit part on the Terasic DE4 FPGA board. It should be
trivial to adapt the driver's attach routine to other embedded boards
using with any parts in the family.
Also import isfctl(8) which can be used to erase sections of the flash.
Sponsored by: DARPA, AFRL
which can be synthesised in Altera FPGAs. An altera_sdcardc device
probes during the boot, and /dev/altera_sdcard devices come and go as
inserted and removed. The device driver attaches directly to the
Nexus, as is common for system-on-chip device drivers.
This IP core suffers a number of significant limitations, including a
lack of interrupt-driven I/O -- we must implement timer-driven polling,
only CSD 0 cards (up to 2G) are supported, there are serious memory
access issues that require the driver to verify writes to memory-mapped
buffers, undocumented alignment requirements, and erroneous error
returns. The driver must therefore work quite hard, despite a fairly
simple hardware-software interface. The IP core also supports at most
one outstanding I/O at a time, so is not a speed demon.
However, with the above workarounds, and subject to performance
problems, it works quite reliably in practice, and we can use it for
read-write mounts of root file systems, etc.
Sponsored by: DARPA, AFRL
CPU cores on Altera FPGAs. The device driver allows memory-mapped devices
on Altera's Avalon SoC bus to be exported to userspace via device nodes.
device.hints directories dictate device name, permissible access methods,
physical address and length, and I/O alignment. Devices can be accessed
using read(2)/write(2), but also memory mapped in userspace using mmap(2).
Devices attach directly to the Nexus, as is common for embedded device
drivers; in the future something more mature might be desirable. There is
currently no facility to support directing device-originated interrupts to
userspace.
In the future, this device driver may be renamed to socgen(4), as it can
in principle also be used with other system-on-chip (SoC) busses, such as
Axi on ASICs and FPGAs. However, we have only tested it on Avalon busses
with memory-mapped ROMs, frame buffers, etc.
Sponsored by: DARPA, AFRL
Bluespec Extensible RISC Implementation (BERI) processor. BERI is a 64-bit
MIPS ISA soft CPU core that can be synthesised to Altera and Xilinx FPGAs,
and is being used for CPU and OS research at several institutions.
Sponsored by: DARPA, AFRL
on PowerPC support. This was clearly not something syscons was
designed to do (very specific assumptions about the nature of VGA
consoles on PCs), but fortunately others have long since blazed
the way on making it work regardless of that.
Sponsored by: DARPA, AFRL
reach single user mode using a memory disk device as the file system.
This port includes the framebuffer driver, the PIC driver, a platform
driver and the GPIO driver. The IPC driver (to talk to IOS kernels) is
not yet written but there's a placeholder for it.
There are still some MMU problems and to get a working system you need to
patch locore32.S. Since we haven't found the best way yet to address that
problem, we're not committing those changes yet. The problem is related to
the different BAT layout on the Wii and to the fact that the Homebrew
loader doesn't clean up the special registers (including the 8 BATs)
before passing control to us.
You'll need a Wii with Homebrew loader and a TV that can do NTSC (for now).
Submitted by: Margarida Gouveia
used, serves very little value given that FreeBSD runs on real H/W
for a long time.
Note that SKI is open-source (see http://ski.sourceforge.net), so
if there's interest and value again, then this code can be revived.
Discussed with: jhb
r238211:
Support TARGET_ARCH=armv6 and TARGET_ARCH=armv6eb
This adds a new TARGET_ARCH for building on ARM
processors that support the ARMv6K multiprocessor
extensions. In particular, these processors have
better support for TLS and mutex operations.
This mostly touches a lot of Makefiles to extend
existing patterns for inferring CPUARCH from ARCH.
It also configures:
* GCC to default to arm1176jz-s
* GCC to predefine __FreeBSD_ARCH_armv6__
* gas to default to ARM_ARCH_V6K
* uname -p to return 'armv6'
* make so that MACHINE_ARCH defaults to 'armv6'
It also changes a number of headers to use
the compiler __ARM_ARCH_XXX__ macros to configure
processor-specific support routines.
Submitted by: Tim Kientzle <kientzle@freebsd.org>
Cummulative patch of changes that are not vendor-specific:
- ARMv6 and ARMv7 architecture support
- ARM SMP support
- VFP/Neon support
- ARM Generic Interrupt Controller driver
- Simplification of startup code for all platforms
advantages. First, PV entries are roughly half the size. Second, this
allocator doesn't access the paging queues, and thus it will allow for the
removal of the page queues lock from this pmap.
Fix a rather serious bug in pmap_remove_write(). After removing write
access from the specified page's first mapping, pmap_remove_write() then
used the wrong "next" pointer. Consequently, the page's second, third,
etc. mappings were not write protected.
Tested by: jchandra
Make the process of embedding MDROOT images less perilous by
makeing the target that links kernel and embedding the image
depend on the image. This means, if the image doesn't exist you
find out before you try to boot from it and that if you change
the image you don't have to touch some random source file to
cause a rebuild.
Don't hide that we're embedding the image.
ktr(4), was constrained to be a power of two. Remove this constraint and
update sys/conf/NOTES accordingly.
Reviewed by: jhb
Approved by: gnn (mentor)
Sponsored by: Google Summer of Code 2012
subdevice ahciem. Emulate SEMB SES device from AHCI LED interface to expose
it to users in form of ses(4) CAM device. If we ever see AHCI controllers
supporting SES of SAF-TE over I2C as described by specification, they should
fit well into this new picture.
Sponsored by: iXsystems, Inc.
* Introduce TX DMA setup/teardown methods, mirroring what's done in
the RX path.
Although the TX DMA descriptor is setup via ath_desc_alloc() /
ath_desc_free(), there TX status descriptor ring will be allocated
in this path.
* Remove some of the TX EDMA capability probing from the RX path and
push it into the new TX EDMA path.
arm platform. Add all the atmel boards to the ATMEL kernel for
testing purposes. Until boot loader arg parsing of baord type
is done, this won't actually be able to do the runtime selection.
correctly. We now iterate the EFI memory descriptors once and collect all
the information in a single pass. This includes:
1. The I/O port base address,
2. The PAL memory region. Have the physmem API track this.
3. Memory descriptors of memory we can't use, like bad memory, runtime
services code & data, etc. Have the physmem API track these.
4. memory descriptors of memory we can use or re-use, such as free
memory, boot time services code & data, loader code & data, etc.
These are added by the physmem API.
Since the PBVM page table and pages are in memory described as loader
data, inform the physmem API of chunks that need to be delated from the
available physical memory.
While here, remove Maxmem and replace it with the better named paddr_max.
Maxmem was defined as physmem, which is generally wrong. Now, paddr_max
is properly defined as the largesty physical address.
The upshot of all this is that:
1. We properly determine realmem.
2. We maximize physmem by re-using memory where possible.
3. We remove complexity from ia64_init() in machdep.c.
4. Remove confusion about realmem, physmem & Maxmem.
The new ia64_physmem_alloc() is to replace pmap_steal_memory() in pmap.c,
as well as replace the handcrafted allocation of the VHPT for the BSP in
pmap_bootstrap() in pmap.c. This is step 2 and addresses the manipulation
of phys_avail after it is being created.
shared code update and small changes in core required
Add support for new i210/i211 devices
Improve queue calculation based on mac type
MFC after:5 days
This driver does not yet handle multiple chip selects properly.
Note that the NAND infrastructure does not perform full page
reads or writes, which means that this driver cannot make use
of the hardware ECC that is otherwise present.
the aggressive pattern matching of the :C modifier. I tested build and
install in 2 phases, however with different solutions, resulting in the
breakage. Mea culpa.
The solution is to break out the all: target. This causes a few lines of
code duplication, but now the all: target works as it should, and the
other targets continue to work as they did before.
While I'm here, add a ===> header line to the start of each port build
to make it easier to find/more clear in the logs.
Asus laptops. It is alike to acpi_asus(4), but uses WMI interface instead
of separate ACPI device.
On Asus EeePC T101MT netbook it allows to handle hotkeys and on/off WLAN,
Bluetooth, LCD backlight, camera, cardreader and touchpad.
On Asus UX31A ultrabook it allows to handle hotkeys, on/off WLAN, Bluetooth,
Wireless LED, control keyboard backlight brightness, monitor temperature
and fan speed. LCD brightness control doesn't work now for unknown reason,
possibly requiring some video card initialization.
Sponsored by: iXsystems, Inc.
important for those that use -DNO_CLEAN routinely, since it will prevent
installing stale stuff, and even more important when the port is upgraded
to a newer version. When the user doesn't use -DNO_CLEAN, this will create
an infinitesimal amount of extra work, but won't hurt anything.
This is necessary because the ports tree has flags that prevent the ususal
'update the build if newer source files exist' logic from doing what it
would do in the base.
This includes a few new fields in each RXed frame:
* per chain RX RSSI (ctl and ext);
* current RX chainmask;
* EVM information;
* PHY error code;
* basic RX status bits (CRC error, PHY error, etc).
This is primarily to allow me to do some userland PHY error processing
for radar and spectral scan data. However since EVM and per-chain RSSI
is provided, others may find it useful for a variety of tasks.
The default is to not compile in the radiotap vendor extensions, primarily
because tcpdump doesn't seem to handle the particular vendor extension
layout I'm using, and I'd rather not break existing code out there that
may be (badly) parsing the radiotap data.
Instead, add the option 'ATH_ENABLE_RADIOTAP_VENDOR_EXT' to your kernel
configuration file to enable these options.
usermode, using shared page. The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.
The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.
The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.
Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.
Minimal stubs neccessary for non-x86 architectures to still compile
are provided.
Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
These are available as t3_tom and t4_tom modules that augment cxgb(4)
and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as
usual with or without these extra features.
- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the
works and will follow soon.
Build-tested with make universe.
30s overview
============
What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE
Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe
Which connections are offloaded? Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe
Reviewed by: bz, gnn
Sponsored by: Chelsio communications.
MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)
LOCALBASE/bin and sbin to PATH, allowing dependencies to be found;
adding SRC_BASE and OSVERSION to match the new kernel, and putting
the related builds under MAKEOBJDIRPREFIX so that they only need
to be built once per kernel.
In addition to the PR this includes ideas/contributions from crees
and matthew.
PR: ports/161452
Submitted by: Garrett Cooper <yanegomi@gmail.com>
redboot. Support is very preiminary and likely needs some work. Also,
do some minor code shuffling of the FreeBSD /boot/loader metadata
parsing code. This code is preliminary and should be used with
caution.
is enabled, sets values based on the metadata passed in. Otherwise
fake_preload_metadata is called. Change the default parse_boot_param
to default_parse_boot_param. Enable this functionality only on the mv
platform, which is where most of the code is from.
Reviewed by: cognet, Ian Lapore
- Remove cpuset stopped_cpus which is no longer used.
- Add a short comment for cpuset suspended_cpus clearing.
- Fix the un-ordered x86/acpica/acpi_wakeup.c in conf/files.amd64 and i386.
Pointed-out by: attilio@
suspend/resume procedures are minimized among them.
common:
- Add global cpuset suspended_cpus to indicate APs are suspended/resumed.
- Remove acpi_waketag and acpi_wakemap from acpivar.h (no longer used).
- Add some variables in acpi_wakecode.S in order to minimize the difference
among amd64 and i386.
- Disable load_cr3() because now CR3 is restored in resumectx().
amd64:
- Add suspend/resume related members (such as MSR) in PCB.
- Modify savectx() for above new PCB members.
- Merge acpi_switch.S into cpu_switch.S as resumectx().
i386:
- Merge(and remove) suspendctx() into savectx() in order to match with
amd64 code.
Reviewed by: attilio@, acpi@
CAM_DEBUG_CDB, CAM_DEBUG_PERIPH and CAM_DEBUG_PROBE) by default.
List of these flags can be modified with CAM_DEBUG_COMPILE kernel option.
CAMDEBUG kernel option still enables all possible debug, if not overriden.
Additional 50KB of kernel size is a good price for the ability to debug
problems without rebuilding the kernel. In case where size is important,
debugging can be compiled out by setting CAM_DEBUG_COMPILE option to 0.
Make the default role NONE if target mode is selected. This
allows ctl(8) to switch to/from target mode via knob settings.
If we default to role 'none', this causes a reset of the
24XX f/w which then causes initiators to wake up and notice
when we come online.
Reviewed by: kdm
MFC after: 2 weeks
Sponsored by: Spectralogic
the i/o regions of the vnode data space. The implementation is quite
simple-minded, it uses the list of the lock requests, ordered by
arrival time. Each request may be for read or for write. The
implementation is fair FIFO.
MFC after: 2 month
implementation specific vs. the common architecture definition.
Bring PPC4XX defines (PSL, SPR, TLB). Note the new definitions under
BOOKE_PPC4XX are not used in the code yet.
This change set is not supposed to affect existing E500 support, it's just
another reorg step before bringing support for E500mc, E5500 and PPC465.
Obtained from: AppliedMicro, Freescale, Semihalf
Revamp the CAM enclosure services driver.
This updated driver uses an in-kernel daemon to track state changes and
publishes physical path location information\for disk elements into the
CAM device database.
Sponsored by: Spectra Logic Corporation
Sponsored by: iXsystems, Inc.
Submitted by: gibbs, will, mav
into partitions.
Partitions are created based on data in dts file which are
extracted and interpreted by slicer.
Obtained from: Semihalf
Supported by: FreeBSD Foundation, Juniper Networks
There's some TX path TDMA code in if_ath_tx.c which should be migrated
out, but first I should likely try and verify/fix/repair the TDMA support
in 9.x and -HEAD.
* migrate the rx processing out into if_ath_rx.c
* migrate the TSF functions into if_ath_tsf.h, as inlines
This is in prepration for supporting the EDMA RX routines, required to
support the AR93xx series NICs.
TODO:
* ath_start() shouldn't be private, but it's called as part of
the RX path. I should likely migrate ath_rx_tasklet() back into
if_ath.c and then return this to be 'static'. The RX code really
shouldn't need to see TX routines (and vice versa.)
* ath_beacon_* should be in if_ath_beacon.[ch].
* ath_tdma_* should be in if_ath_tdma.[ch] ...
The NAND Flash environment consists of several distinct components:
- NAND framework (drivers harness for NAND controllers and NAND chips)
- NAND simulator (NANDsim)
- NAND file system (NAND FS)
- Companion tools and utilities
- Documentation (manual pages)
This work is still experimental. Please use with caution.
Obtained from: Semihalf
Supported by: FreeBSD Foundation, Juniper Networks
* Add in the AR724x support. It probes the same as an AR8216/AR8316, so
just add in a hint to force the probe success rather than auto-detecting
it.
* Add in the missing entries from conf/files, lacking in the previous
commit.
The register values and CPU port / mirror port initialisation value was
obtained from Linux OpenWRT ag71xx_ar7240.c.
The DELAY(1000) to let things settle is my local workaround. For some
reason, PHY4 doesn't seem to probe very reliably without it. It's quite
possible that we're missing some MDIO bus initialisation code in if_arge
for the AR724x case. As I dislike DELAY() workarounds in general, it's
definitely worth trying to figure out why this is the case.
Tested on: AP93 (AR7240) reference design
Obtained from: Linux OpenWRT
ports. This currently is a nop, but will soon be used to allow
support for multiple boards to be built into one kernel (starting with
AT91RM9200 and expanding out from there).
This is only done if the ARGE_MDIO option is included.
* Shuffle the arge MDIO bus into a separate device, that needs to be
probed early (use hint.argemdio.X.order=0)
* hint.arge.X.mdio now specifies which miiproxy to rendezvous with.
* Call MAC/MDIO bus init during MDIO attach, not arge attach.
This is done regardless:
* Shift the arge MAC and MDIO bus reset code into separate functions
and call it early during MDIO bus attach. It's required for
correct MDIO bus IO to occur on AR71xx/AR91xx devices.
* Remove the AR71xx/AR91xx centric assumption that there's only one
MDIO bus. The initial code mapped miibus0(arge0) and miibus1(arge1)
MII register operations to the MII0 (arge0) register space. The
AR724x (and later, upcoming chipsets) have two MDIO busses and
the second is very much in use.
TODO:
* since the multiphy behaviour has changed (where now a phymask of >1
PHY will still be enumerated), multiphy setups may be quite wrong.
I'll go and fix these so they still have a chance of working, at least.
until the switch PHY support appears in -HEAD.
Submitted by: Stefan Bethke <stb@lassitu.de>
MDIO/MII rendezvous proxy.
* Add an 'mdio' bus, which is the "IO" side of an MII bus (but by design
can be anything which implements the underlying register access API.)
* Add 'miiproxy' and 'mdioproxy', which provides a rendezvous mechanism
for MII busses to appear hanging off arbitrary busses (ie, that aren't
necessarily a traditional looking MII bus.)
MII busses can now hang off anything that implements an mdiobus.
For the AR71xx SoC, there's one MDIO bus but two MII busses. So to
properly support two or more real PHYs, this can be done:
# arge0 MDIO bus - there's no arge1 MDIO bus for AR71xx
hint.argemdio.0.at="nexus0"
hint.argemdio.0.maddr=0x19000000
hint.argemdio.0.msize=0x1000
hint.argemdio.0.order=0
# Create two mdioproxy instances
hint.mdioproxy.0.at="mdio0"
hint.mdioproxy.1.at="mdio0"
# .. and with a follow-up patch
hint.arge.0.mdio=mdioproxy0
hint.arge.1.mdio=mdioproxy0
TODO:
* Do a sweep or two and add appropriate locking in mdio/mdioproxy/miiproxy.
Submitted by: Stefan Bethke <stb@lassitu.de>
Reviewed by: ray
defined by the SNIA Common RAID Disk Data Format Specification v2.0.
Supports multiple volumes per array and multiple partitions per disk.
Supports standard big-endian and Adaptec's little-endian byte ordering.
Supports all single-layer RAID levels. Dual-layer RAID levels except
RAID10 are not supported now because of GEOM RAID design limitations.
Some work is still to be done, but the present code already manages basic
interoperation with RAID BIOS of the Adaptec 1430SA SATA RAID controller.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
- Mark 'sdp' as requiring 'inet'.
- Always include "opt_inet.h" and "opt_inet6.h" and modify the IB
driver Makefiles to honor WITH/WITHOUT_INET/INET6/_SUPPORT options
to determine what should be enabled during a module build.
- Fix the mlxen(4) driver and the core IB code to compile without
if INET is disabled (including when both INET and INET6 are disabled).
Reviewed by: bz
MFC after: 2 weeks
identical now that the bus spaces are unified under sys/x86.
Replace them with a single uart_cpu_x86.c.
o delete uart_cpu_i386.c
o move uart_cpu_amd64.c to uart_cpu_x86.c
o update files.amd64 and files.i386 accordingly.
used in the code which needs to implement some specific
behaviour when being run under QEMU.
- Make PXA UART probe code to work under QEMU gumstix, which
doesn't emulate all the ports properly.
First cut of new HW support from LSI and merge into FreeBSD.
Supports Drake Skinny and ThunderBolt cards.
MFhead_mfi r227574
Style
MFhead_mfi r227579
Use bus_addr_t instead of uintXX_t.
MFhead_mfi r227580
MSI support
MFhead_mfi r227612
More bus_addr_t and remove "#ifdef __amd64__".
MFhead_mfi r227905
Improved timeout support from Scott.
MFhead_mfi r228108
Make file.
MFhead_mfi r228208
Fixed botched merge of Skinny support and enhanced handling
in call back routine.
MFhead_mfi r228279
Remove superfluous !TAILQ_EMPTY() checks before TAILQ_FOREACH().
MFhead_mfi r228310
Move mfi_decode_evt() to taskqueue.
MFhead_mfi r228320
Implement MFI_DEBUG for 64bit S/G lists.
MFhead_mfi r231988
Restore structure layout by reverting the array header to
use [0] instead of [1].
MFhead_mfi r232412
Put wildcard pattern later in the match table.
MFhead_mfi r232413
Use lower case for hexadecimal numbers to match surrounding
style.
MFhead_mfi r232414
Add more Thunderbolt variants.
MFhead_mfi r232888
Don't act on events prior to boot or when shutting down.
Add hw.mfi.detect_jbod_change to enable or disable acting
on JBOD type of disks being added on insert and removed on
removing. Switch hw.mfi.msi to 1 by default since it works
better on newer cards.
MFhead_mfi r233016
Release driver lock before taking Giant when deleting children.
Use TAILQ_FOREACH_SAFE when items can be deleted. Make code a
little simplier to follow. Fix a couple more style issues.
MFhead_mfi r233620
Update mfi_spare/mfi_array with the actual number of elements
for array_ref and pd. Change these max. #define names to avoid
name space collisions. This will require an update to mfiutil
It avoids mfiutil having to do a magic calculation.
Add a note and #define to state that a "SYSTEM" disk is really
what the firmware calls a "JBOD" drive.
Thanks to the many that helped, LSI for the initial code drop,
mav, delphij, jhb, sbruno that all helped with code and testing.
This makes our naming scheme more closely match other systems and the
expectations of much third-party software. MIPS builds which are little-endian
should require and exhibit no changes. Big-endian TARGET_ARCHes must be
changed:
From: To:
mipseb mips
mipsn32eb mipsn32
mips64eb mips64
An entry has been added to UPDATING and some foot-shooting protection (complete
with warnings which should become errors in the near future) to the top-level
base system Makefile.
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
The 'make depend' rules have to use custom -I paths for the special compat
includes for the opensolaris/zfs headers.
This option will pull in the couple of files that are shared with dtrace,
but they appear to correctly use the MODULE_VERSION/MODULE_DEPEND rules
so loader should do the right thing, as should kldload.
Reviewed by: pjd (glanced at)
Add a Simple polled driver iicoc for the OpenCores I2C controller. This
is used in Netlogic XLP processors.
Submitted by: Sreekanth M. S. (kanthms at netlogicmicro com)
The earlier version of the driver is sys/mips/rmi/dev/iic/ds1374u.c
Convert all references to ds1374u to ds1374, and use DEVMETHOD_END.
Also update the license header as Netlogic is now Broadcom.
sys/dev/mps/mps_sas.c:861:1: error: function 'mpssas_discovery_timeout' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
mpssas_discovery_timeout(void *data)
^
Because the driver is obtained from upstream, we don't want to modify
it; just silence the warning instead, it is harmless.
MFC after: 3 days
- Replace MIPS24K-specific code with more generic framework that will
make adding new CPU support easier
- Add MIPS24K support for new framework
- Limit backtrace depth to 1 for stability reasons and add option
HWPMC_MIPS_BACKTRACE to override this limitation
longer serve any purpose. Prior to r157446, they served a purpose
because there was a fixed amount of kernel virtual address space
reserved for pv entries at boot time. However, since that change pv
entries are accessed through the direct map, and so there is no limit
imposed by a fixed amount of kernel virtual address space.
Fix a couple of nearby style issues.
Reviewed by: jhb, kib
MFC after: 1 week
- Merge r232744 changes to pc98.
(Allow a kernel to be built with 'nodevice atpic'.)
- Move ICU related defines from x86/isa/atpic.c to x86/isa/icu.h and
use them in x86/x86/intr_machdep.c.
Reviewed by: jhb
to match reality: clang does _not_ disable SSE automatically when
-mno-mmx is used, you have to specify -mno-sse explicitly.
Note this was the case even before r232894, which only makes a change in
the 'positive' flag case; e.g. when you specify -msse, MMX gets enabled
too.
MFC after: 1 week
required for the ABI the kernel is being built for.
XXX This is implemented in a kind-of nasty way that involves including source
files, but it's still an improvement.
o) Retire ISA_* options since they're unused and were always wrong.
of the Cavium Simple Executive, which violates large function growth rules
in such a way that simply increasing the large function growth parameter is
insufficient.