This is done per request/suggestion from John Baldwin
who introduced the option. Trying to resume normal
system operation after a panic is very unpredictable
and dangerous. It will become even more dangerous
when we allow a thread in panic(9) to penetrate all
lock contexts.
I understand that the only purpose of this option was
for testing scenarios potentially resulting in panic.
Suggested by: jhb
Reviewed by: attilio, jhb
X-MFC-After: never
Approved by: re (kib)
This patch is going to help in cases like mips flavours where you
want a more granular support on MAXCPU.
No MFC is previewed for this patch.
Tested by: pluknet
Approved by: re (kib)
This patch adds support for the Netlogic XLP mips64 processors in
the common MIPS code. The changes are :
- Add CPU_NLM processor type
- Add cases for CPU_NLM, mostly were CPU_RMI is used.
- Update cache flush changes for CPU_NLM
- Add kernel build configuration files for xLP.
In collaboration with: Prabhath Raman <prabhathpr at netlogicmicro com>
Approved by: bz(re), jmallett, imp(mips)
Allow changing the trampoline ABI with makeoptions, this will allow
us to have a trampoline with a different ABI from the kernel.
Useful in cases where we have to boot a 64 bit kernel from a
bootloader which supports only 32 bit or vice versa.
Approved by: bz(re), jmallett, imp
option that is highly recommended to be adjusted in too much
documentation while doing nothing in FreeBSD since r2729 (rev 1.1).
ipcs(1) needs to be recompiled as it is accessing _KERNEL private
variables.
Reviewed by: jhb (before comment change on linux code)
Sponsored by: Sandvine Incorporated
This option will enable Capsicum capabilities, which provide a fine-grained
mask on operations that can be performed on file descriptors.
Approved by: mentor (rwatson), re (Capsicum blanket ok)
Sponsored by: Google Inc
to do with global namespaces) and CAPABILITIES (which has to do with
constraining file descriptors). Just in case, and because it's a better
name anyway, let's move CAPABILITIES out of the way.
Also, change opt_capabilities.h to opt_capsicum.h; for now, this will
only hold CAPABILITY_MODE, but it will probably also hold the new
CAPABILITIES (implying constrained file descriptors) in the future.
Approved by: rwatson
Sponsored by: Google UK Ltd
can be tested.
This doesn't at all actually do radar detection! It's just
so developers who wish to test the net80211 DFS code can easily
do so. Without this flag, the DFS channels are never marked
DFS and thus the DFS stuff doesn't run.
o Setting td_intr_frame to the XIVs trap frame because it's referenced
by the ET event handler.
o Signal EOI to the CPU before calling the registered XIV handlers.
This prevents lost ITC interrupts, which cause starvation in one-shot
mode.
o Adding support for IPI_HARDCLOCK with corresponding per-CPU counters.
o Have the APs call cpu_initclocks() so as to limited the scattering of
clock related initialization. cpu_initclocks() calls the <self>_bsp()
or <self>_ap() version accordingly.
o Uncomment the ET clock handling in cpu_idle().
o Update the DDB 'show pcpu' output for the new MD fields.
o Entirely rewritten ia64_ih_clock(). Note that we don't create as many
clock XIVs as we have CPUs, as is done on PowerPC. It doesn't scale.
We can only have 240 XIVs and we can have more CPUs than that. There's
a single intrcnt index for the cumulative clock ticks and we keep per
CPU counts in the PCPU stats structure.
o Register the ITC by hooking SI_SUB_CONFIGURE (2nd order).
Open issues:
o Clock interrupts can still be lost. Some tweaking is still necessary.
Thanks to: mav@ for his support, feedback and explanations.
ET stats while committing:
eris% sysctl machdep.cpu | grep nclks
machdep.cpu.0.nclks: 24007
machdep.cpu.1.nclks: 22895
machdep.cpu.2.nclks: 13523
machdep.cpu.3.nclks: 9342
machdep.cpu.4.nclks: 9103
machdep.cpu.5.nclks: 9298
machdep.cpu.6.nclks: 10039
machdep.cpu.7.nclks: 9479
eris% vmstat -i | grep clock
clock 108599 50
the future, but presents a set of simple block devices for now. With
(forthcoming) boot loader support or vfs.root.mountfrom, allows booting
PS3s from disk.
Submitted by: glevand <geoffrey.levand@mail.ru>
struct inpcbgroup. pcbgroups, or "connection groups", supplement the
existing inpcbinfo connection hash table, which when pcbgroups are
enabled, might now be thought of more usefully as a per-protocol
4-tuple reservation table.
Connections are assigned to connection groups base on a hash of their
4-tuple; wildcard sockets require special handling, and are members
of all connection groups. During a connection lookup, a
per-connection group lock is employed rather than the global pcbinfo
lock. By aligning connection groups with input path processing,
connection groups take on an effective CPU affinity, especially when
aligned with RSS work placement (see a forthcoming commit for
details). This eliminates cache line migration associated with
global, protocol-layer data structures in steady state TCP and UDP
processing (with the exception of protocol-layer statistics; further
commit to follow).
Elements of this approach were inspired by Willman, Rixner, and Cox's
2006 USENIX paper, "An Evaluation of Network Stack Parallelization
Strategies in Modern Operating Systems". However, there are also
significant differences: we maintain the inpcb lock, rather than using
the connection group lock for per-connection state.
Likewise, the focus of this implementation is alignment with NIC
packet distribution strategies such as RSS, rather than pure software
strategies. Despite that focus, software distribution is supported
through the parallel netisr implementation, and works well in
configurations where the number of hardware threads is greater than
the number of NIC input queues, such as in the RMI XLR threaded MIPS
architecture.
Another important difference is the continued maintenance of existing
hash tables as "reservation tables" -- these are useful both to
distinguish the resource allocation aspect of protocol name management
and the more common-case lookup aspect. In configurations where
connection tables are aligned with hardware hashes, it is desirable to
use the traditional lookup tables for loopback or encapsulated traffic
rather than take the expense of hardware hashes that are hard to
implement efficiently in software (such as RSS Toeplitz).
Connection group support is enabled by compiling "options PCBGROUP"
into your kernel configuration; for the time being, this is an
experimental feature, and hence is not enabled by default.
Subject to the limited MFCability of change dependencies in inpcb,
and its change to the inpcbinfo init function signature, this change
in principle could be merged to FreeBSD 8.x.
Reviewed by: bz
Sponsored by: Juniper Networks, Inc.
This is in no way a complete DFS/radar detection implementation!
It merely creates an abstracted interface which allows for future
development of the DFS radar detection code.
Note: Net80211 already handles the bulk of the DFS machinery,
all we need to do here is figure out that a radar event has occured
and inform it as such. It then drives the DFS state engine for us.
The "null" DFS radar detection module is included by default;
it doesn't require a device line.
This commit:
* Adds a simple abstracted layer for radar detection state -
sys/dev/ath/ath_dfs/;
* Implements a null DFS module which doesn't do anything;
(ie, implements the exact behaviour at the moment);
* Adds hooks to the ath driver to process received radar events
and gives the DFS module a chance to determine whether
a radar has been detected.
Obtained from: Atheros
This introduce all the underlying support for making this possible (via
the function cpusetobj_strscan() and keeps ktr_cpumask exported. sparc64
implements its own assembly primitives for tracing events and needs to
properly check it. Anyway the sparc64 logic is not implemented yet due
to lack of knowledge (by me) and time (by marius), but it is just a
matter of using ktr_cpumask when possible.
Tested and fixed by: pluknet
Reviewed by: marius
filters working. (All other filters - switch without L2 info rewrite,
steer, and drop - were already fully-functional).
Some contrived examples of "switch" filters with L2 rewriting:
# cxgbetool t4nex0 iport 0 dport 80 action switch vlan +9 eport 3
Intercept all packets received on physical port 0 with TCP port 80 as
destination, insert a vlan tag with VID 9, and send them out of port 3.
# cxgbetool t4nex0 sip 192.168.1.1/32 ivlan 5 action switch \
vlan =9 smac aa:bb:cc:dd:ee:ff eport 0
Intercept all packets (received on any port) with source IP address
192.168.1.1 and VLAN id 5, rewrite the VLAN id to 9, rewrite source mac
to aa:bb:cc:dd:ee:ff, and send it out of port 0.
MFC after: 1 week
that will connect all of the various sensors and fan control modules on
Apple hardware with software-controlled fans (e.g. all G5 systems).
MFC after: 1 month
Alexander Best (arundel@).
For clang, -fdiagnostics-show-option is enabled by default, but for gcc it
isn't. This option will report which -W* flag was responsible for triggering
a certain warning. This will bring gcc warnings closer to the ones clang emits
and might also help developers track down tinderbox failures a bit quicker.
Submitted by: arundel
wihtout updating world (good transition aide for -current, but also
allows kernels to be built on -stable the old way too). This likely
should go away around FreeBSD 10.0 or so.
WITH{OUT,}_KERNEL_SYMBOLS (defaulting to WITH). In the fullness of
time, likely around 2020, INSTALL_NODEBUG will be removed. For now,
don't print a warning when using INSTALL_NODEBUG, but that will be
coming soon.
flags are also specified. This change makes use of this behaviour and removes
unneeded -mno-* flags.
Note that clang does not yet enable AVX support for any CPU. However at some
point in the future it will and since we definitely want to disable it for the
kernel, we might as well add the -mno-avx flag now.
Submitted by: arundel
driver would verify that requests for child devices were confined to any
existing I/O windows, but the driver relied on the firmware to initialize
the windows and would never grow the windows for new requests. Now the
driver actively manages the I/O windows.
This is implemented by allocating a bus resource for each I/O window from
the parent PCI bus and suballocating that resource to child devices. The
suballocations are managed by creating an rman for each I/O window. The
suballocated resources are mapped by passing the bus_activate_resource()
call up to the parent PCI bus. Windows are grown when needed by using
bus_adjust_resource() to adjust the resource allocated from the parent PCI
bus. If the adjust request succeeds, the window is adjusted and the
suballocation request for the child device is retried.
When growing a window, the rman_first_free_region() and
rman_last_free_region() routines are used to determine if the front or
end of the existing I/O window is free. From using that, the smallest
ranges that need to be added to either the front or back of the window
are computed. The driver will first try to grow the window in whichever
direction requires the smallest growth first followed by the other
direction if that fails.
Subtractive bridges will first attempt to satisfy requests for child
resources from I/O windows (including attempts to grow the windows). If
that fails, the request is passed up to the parent PCI bus directly
however.
The PCI-PCI bridge driver will try to use firmware-assigned ranges for
child BARs first and only allocate a "fresh" range if that specific range
cannot be accommodated in the I/O window. This allows systems where the
firmware assigns resources during boot but later wipes the I/O windows
(some ACPI BIOSen are known to do this) to "rediscover" the original I/O
window ranges.
The ACPI Host-PCI bridge driver has been adjusted to correctly honor
hw.acpi.host_mem_start and the I/O port equivalent when a PCI-PCI bridge
makes a wildcard request for an I/O window range.
The new PCI-PCI bridge driver is only enabled if the NEW_PCIB kernel option
is enabled. This is a transition aide to allow platforms that do not
yet support bus_activate_resource() and bus_adjust_resource() in their
Host-PCI bridge drivers (and possibly other drivers as needed) to use the
old driver for now. Once all platforms support the new driver, the
kernel option and old driver will be removed.
PR: kern/143874 kern/149306
Tested by: mav
and our users complained when broken.
Similarly add LINT-NOINET, and for at least documentation purposes add
LINT-NOIP (which compiles out INET and INET6 and couple of NIC drivers).
Tested by: make universe (if you broke it since you fix it)
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 2 weeks
developers committing new code with broken include directories.
Fix a few whitespace issues.
Improve a couple of comments.
-W is now deprecated and is referred to as -Wextra (see gcc(1)).
Submitted by: arundel
use the PBVM. This eliminates the implied hardcoding of the
physical address at which the kernel needs to be loaded. Using the
PBVM makes it possible to load the kernel irrespective of the
physical memory organization and allows us to replicate kernel text
on NUMA machines.
While here, reduce the direct-mapped page size to the kernel's
page size so that we can support memory attributes better.
kernel configurations to apply WITH_* WITHOUT_* knobs we use for
module building as well to restrict or control opt_*.h flags.
Reviewed by: imp, +
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 2 weeks
to get the files for an IPv6 only kernel as well, remove extra inet6
option where not needed.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days
Add some comments at #endifs given more nestedness. To make the compiler
happy, some default initializations were added in accordance with the style
on the files.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days
as well compiling out most functions adding or extending #ifdef INET
coverage.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days
The AR9130 is an AR9160/AR5416 family WMAC which is glued directly
to the AR913x SoC peripheral bus (APB) rather than via a PCI/PCIe
bridge.
The specifics:
* A new build option is required to use the AR9130 - AH_SUPPORT_AR9130.
This is needed due to the different location the RTC registers live
with this chip; hopefully this will be undone in the future.
This does currently mean that enabling this option will break non-AR9130
builds, so don't enable it unless you're specifically building an image
for the AR913x SoC.
* Add the new probe, attach, EEPROM and PLL methods specific to Howl.
* Add a work-around to ah_eeprom_v14.c which disables some of the checks
for endian-ness and magic in the EEPROM image if an eepromdata block
is provided. This'll be fixed at a later stage by porting the ath9k
probe code and making sure it doesn't break in other setups (which
my previous attempt at this did.)
* Sprinkle Howl modifications throughput the interrupt path - it doesn't
implement the SYNC interrupt registers, so ignore those.
* Sprinkle Howl chip powerup/down throughout the reset path; the RTC methods
were
* Sprinkle some other Howl workarounds in the reset path.
* Hard-code an alternative setup for the AR_CFG register for Howl, that
sets up things suitable for Big-Endian MIPS (which is the only platform
this chip is glued to.)
This has been tested on the AR913x based TP-Link WR-1043nd mode, in
legacy, HT/20 and HT/40 modes.
Caveats:
* 2ghz has only been tested. I've not seen any 5ghz radios glued to this
chipset so I can't test it.
* AR5416_INTERRUPT_MITIGATION is not supported on the AR9130. At least,
it isn't implemented in ath9k. Please don't enable this.
* This hasn't been tested in MBSS mode or in RX/TX block-aggregation mode.
Expose ip_icmp.c to INET6 as well and only export badport_bandlim()
along with the two sysctls in the non-INET case.
The bandlim types work for all cases I reviewed in IPv6 as well and
the sysctls are available as we export net.inet.* from in_proto.c.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 4 days
device in /dev/ create symbolic link with adY name, trying to mimic old ATA
numbering. Imitation is not complete, but should be enough in most cases to
mount file systems without touching /etc/fstab.
- To know what behavior to mimic, restore ATA_STATIC_ID option in cases
where it was present before.
- Add some more details to UPDATING.
set the f_flags field of "struct statfs". This had the interesting
effect of making the NFSv4 mounts "disappear" after r221014,
since NFSMNT_NFSV4 and MNT_IGNORE became the same bit.
Move the files used for a diskless NFS root from sys/nfsclient
to sys/nfs in preparation for them to be used by both NFS
clients. Also, move the declaration of the three global data
structures from sys/nfsclient/nfs_vfsops.c to sys/nfs/nfs_diskless.c
so that they are defined when either client uses them.
Reviewed by: jhb
MFC after: 2 weeks
unconditionally backing out r193997, so that they are available for
IPv6-only setups as well.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 5 days
stack. It means that all legacy ATA drivers are disabled and replaced by
respective CAM drivers. If you are using ATA device names in /etc/fstab or
other places, make sure to update them respectively (adX -> adaY,
acdX -> cdY, afdX -> daY, astX -> saY, where 'Y's are the sequential
numbers for each type in order of detection, unless configured otherwise
with tunables, see cam(4)).
ataraid(4) functionality is now supported by the RAID GEOM class.
To use it you can load geom_raid kernel module and use graid(8) tool
for management. Instead of /dev/arX device names, use /dev/raid/rX.
This is destined to be a lightweight and optional set of ALQ
probes for debugging events which are just impossible to debug
with printf/log (eg packet TX/RX handling; AMPDU handling.)
The probes and operations themselves will appear in subsequent
commits.
While in_pseudo() etc. is often used in offloading feature support,
in_cksum() is mostly used to fix some broken hardware.
Keeping both around for the moment allows us to compile NIC drivers
even in an IPv6 only environment without the need to mangle them
with #ifdef INETs in a way they are not prepared for. This will
leave some dead code paths that will not be exercised for IPv6.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC after: 3 days
This support has not worked for several years, and is not likely to work
again, unless Intel decides to release a native FreeBSD version of their
compiler. ;)
While it does not provide any functionality for IPv6, it provides
the sysctl nodes for net.inet.* that a lot of functionality shared
between IPv4 and IPv6 depends on. We cannot change these anymore
without breaking a lot of management and tuning.
In case of IPv6 only, we compile out everything but the sysctl node
declarations.
Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
MFC After: 5 days
embedded flash stores.
Some devices - notably those with uboot - don't have an
explicit partition table (eg like Redboot's FIS.)
geom_map thus provides an easy way to export the hard-coded
flash layout as geom providers for use by filesystems and
other tools.
It also includes a "search" function which allows for
dynamic creation of partition layouts where the device only
has a single hard-coded partition. For example, if
there is a "kernel+rootfs" partition, a single image can
be created which appends the rootfs after the kernel with
an appropriate search string. geom_map can be told to
search for said search string and create a partition
beginning after it.
Submitted by: Aleksandr Rybalko <ray@dlink.ua>
on per-device basis.
- While adding support for per-device sysctls, merge from graid branch
support for ADA_TEST_FAILURE kernel option, which opens few more sysctl,
allowing to simulate read and write errors for testing purposes.
environments into the kernel environment.
The eventual aim is to replace these with specific drivers for
the various bootloaders (redboot, uboot, etc.) This however will
work for the time being until it can be properly addressed.
Submitted by: Aleksandr Rybalko <ray@dlink.ua>
Introduce the AHB glue for Atheros embedded systems. Right now it's
hard-coded for the AR9130 chip whose support isn't yet in this HAL;
it'll be added in a subsequent commit.
Kernel configuration files now need both 'ath' and 'ath_pci' devices; both
modules need to be loaded for the ath device to work.
on the set of rules it maintains and the current resource usage. It also
privides userland API to manage that ruleset.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib (earlier version)
and per-loginclass resource accounting information, to be used by the new
resource limits code. It's connected to the build, but the code that
actually calls the new functions will come later.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib (earlier version)
instead of 0x100000. As a side effect, an amd64 kernel now loads at
physical address 0x200000 instead of 0x100000. This is probably for the
best because it avoids the use of a 2MB page mapping for the first 1MB of
the kernel that also spans the fixed MTRRs. However, getmemsize() still
thinks that the kernel loads at 0x100000, and so the physical memory between
0x100000 and 0x200000 is lost. Fix this problem by replacing the hard-wired
constant in getmemsize() by a symbol "kernphys" that is defined by the
linker script.
In collaboration with: kib
memory detected from Redboot, or overrides the "otherwise" case
if no Redboot information was found.
Some AR71XX platforms don't use Redboot (eg TP-LINK devices using
UBoot; some later Ubiquiti devices which apparently also use
UBoot) and at least one plain out lies - the Ubiquiti LS-SR71A
Redboot says there's 16mb of RAM when in fact there's 32mb.
A more "clean" solution will be needed at a later date.
register changes when compiled with SCHIZO_DEBUG and take advantage
of them.
- Add support for the XMITS Fireplane/Safari to PCI-X bridges. I tought
I'd need this for a Sun Fire 3800, which then turned out to not being
equipped with such a bridge though. The support for these should be
complete but given that it hasn't actually been tested probing is
disabled for now.
This required a way to alter the XMITS configuration in case a PCI-X
device is found further down the device tree so the sparc64 specific
ofw_pci kobj was revived with a ofw_pci_setup_device method, which is
called by the ofw_pcibus code for every device added.
- A closer inspection of the OpenSolaris code indicates that consistent
DMA flushing/syncing as well as the block store workaround should be
applied with every BUS_DMASYNC_POSTREAD instead of in a wrapper around
interrupt handlers for devices behind PCI-PCI bridges only as suggested
by the documentation (code for the latter actually exists in OpenSolaris
but is disabled by default), which also makes more sense.
- Add a workaround for Casinni/Skyhawk combinations. Chances are that
this solves the crashes seen when using the the on-board Casinni NICs
of Sun Fire V480 equipped with centerplanes other than 501-6780 or
501-6790. This also takes advantage of the ofw_pci_setup_device method.
- Mark some unused parameters as such.
Add new RAID GEOM class, that is going to replace ataraid(4) in supporting
various BIOS-based software RAIDs. Unlike ataraid(4) this implementation
does not depend on legacy ata(4) subsystem and can be used with any disk
drivers, including new CAM-based ones (ahci(4), siis(4), mvs(4), ata(4)
with `options ATA_CAM`). To make code more readable and extensible, this
implementation follows modular design, including core part and two sets
of modules, implementing support for different metadata formats and RAID
levels.
Support for such popular metadata formats is now implemented:
Intel, JMicron, NVIDIA, Promise (also used by AMD/ATI) and SiliconImage.
Such RAID levels are now supported:
RAID0, RAID1, RAID1E, RAID10, SINGLE, CONCAT.
For any all of these RAID levels and metadata formats this class supports
full cycle of volume operations: reading, writing, creation, deletion,
disk removal and insertion, rebuilding, dirty shutdown detection
and resynchronization, bad sector recovery, faulty disks tracking,
hot-spare disks. For Intel and Promise formats there is support multiple
volumes per disk set.
Look graid(8) manual page for additional details.
Co-authored by: imp
Sponsored by: Cisco Systems, Inc. and iXsystems, Inc.
It's still not ready for prime-time - there's some TX niggles with these 11n
cards that I'm still trying to wrap my head around, and AMPDU-TX is just not
implemented so things will come to a crashing halt if you're not careful.
services or PAL procedures. The new implementation is based on
specific functions that are known to be called in certain scenarios
only. This in particular fixes the PAL call to obtain information
about translation registers. In general, the new implementation does
not bank on virtual addresses being direct-mapped and will work when
the kernel uses PBVM.
When new scenarios need to be supported, new functions are added if
the existing functions cannot be changed to handle the new scenario.
If a single generic implementation is possible, it will become clear
in due time.
While here, change bootinfo to a pointer type in anticipation of
future development.