Do this by forcing inclusion of
sys/cddl/compat/opensolaris/sys/debug_compat.h
via -include option into all source files from OpenSolaris.
Note that this -include option must always be after -include opt_global.h.
Additionally, remove forced definition of DEBUG for some modules and fix
their build without DEBUG.
Also, meaning of DEBUG was overloaded to enable WITNESS support for some
OpenSolaris (primarily ZFS) locks. Now this overloading is removed and
that use of DEBUG is replaced with a new option OPENSOLARIS_WITNESS.
MFC after: 17 days
- change the SI_SUB_RUN_SCHEDULER sysinits in hv_utilc and
hv_netvsc_drv_freebsd.c to SI_SUB_KTHREAD_IDLE, since the
former is no longer in FreeBSD.
The use of these SYSINITs can probably be removed.
Support chipsets are the Realtek RTL8188SU, RTL8191SU, and RTL8192SU.
Many thanks to Idwer Vollering for porting/writing the man page and for
testing.
Reviewed by: adrian, hselasky
Obtained from: OpenBSD
Tested by: kevlo, Idwer Vollering <vidwer at gmail.com>
* Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option.
The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow.
* random(4) device doesn't really depend on rijndael-*. Yarrow, however, does.
* Add random_adaptors.[ch] which is basically a store of random_adaptor's.
random_adaptor is basically an adapter that plugs in to random(4).
random_adaptor can only be plugged in to random(4) very early in bootup.
Unplugging random_adaptor from random(4) is not supported, and is probably a
bad idea anyway, due to potential loss of entropy pools.
We currently have 3 random_adaptors:
+ yarrow
+ rdrand (ivy.c)
+ nehemeiah
* Remove platform dependent logic from probe.c, and move it into
corresponding registration routines of each random_adaptor provider.
probe.c doesn't do anything other than picking a specific random_adaptor
from a list of registered ones.
* If the kernel doesn't have any random_adaptor adapters present then the
creation of /dev/random is postponed until next random_adaptor is kldload'ed.
* Fix randomdev_soft.c to refer to its own random_adaptor, instead of a
system wide one.
Submitted by: arthurmesh@gmail.com, obrien
Obtained from: Juniper Networks
Reviewed by: obrien
* Make Yarrow an optional kernel component -- enabled by "YARROW_RNG" option.
The files sha2.c, hash.c, randomdev_soft.c and yarrow.c comprise yarrow.
* random(4) device doesn't really depend on rijndael-*. Yarrow, however, does.
* Add random_adaptors.[ch] which is basically a store of random_adaptor's.
random_adaptor is basically an adapter that plugs in to random(4).
random_adaptor can only be plugged in to random(4) very early in bootup.
Unplugging random_adaptor from random(4) is not supported, and is probably a
bad idea anyway, due to potential loss of entropy pools.
We currently have 3 random_adaptors:
+ yarrow
+ rdrand (ivy.c)
+ nehemeiah
* Remove platform dependent logic from probe.c, and move it into
corresponding registration routines of each random_adaptor provider.
probe.c doesn't do anything other than picking a specific random_adaptor
from a list of registered ones.
* If the kernel doesn't have any random_adaptor adapters present then the
creation of /dev/random is postponed until next random_adaptor is kldload'ed.
* Fix randomdev_soft.c to refer to its own random_adaptor, instead of a
system wide one.
Submitted by: arthurmesh@gmail.com, obrien
Obtained from: Juniper Networks
Reviewed by: obrien
all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and
for general purpose monitoring without tapping any cxgbe or cxl ifnet
directly.
Tracers on the T4/T5 chips provide access to Ethernet frames exactly as
they were received from or transmitted on the wire. On transmit, a
tracer will capture a frame after TSO segmentation, hw VLAN tag
insertion, hw L3 & L4 checksum insertion, etc. It will also capture
frames generated by the TCP offload engine (TOE traffic is normally
invisible to the kernel). On receive, a tracer will capture a frame
before hw VLAN extraction, runt filtering, other badness filtering,
before the steering/drop/L2-rewrite filters or the TOE have had a go at
it, and of course before sw LRO in the driver.
There are 4 tracers on a chip. A tracer can trace only in one direction
(tx or rx). For now cxgbetool will set up tracers to capture the first
128B of every transmitted or received frame on a given port. This is a
small subset of what the hardware can do. A pseudo ifnet with the same
name as the nexus driver (t4nex0 or t5nex0) will be created for tracing.
The data delivered to this ifnet is an additional copy made inside the
chip. Normal delivery to cxgbe<n> or cxl<n> will be made as usual.
/* watch cxl0, which is the first port hanging off t5nex0. */
# cxgbetool t5nex0 tracer 0 tx0 (watch what cxl0 is transmitting)
# cxgbetool t5nex0 tracer 1 rx0 (watch what cxl0 is receiving)
# cxgbetool t5nex0 tracer list
# tcpdump -i t5nex0 <== all that cxl0 sees and puts on the wire
If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K
"frames" with no L3/L4 checksum but this will show you the frames that
were actually transmitted.
/* all done */
# cxgbetool t5nex0 tracer 0 disable
# cxgbetool t5nex0 tracer 1 disable
# cxgbetool t5nex0 tracer list
# ifconfig t5nex0 destroy
USB mouse and USB modem classes. Hopefully someone will find
these examples useful when implementing USB device side drivers
using the FreeBSD USB stack.
As part of this commit, add an nvme_strvis() function which borrows
heavily from cam_strvis(). This will allow stripping of
leading/trailing whitespace and also handle unprintable characters
in model/serial numbers. This function goes into a new nvme_util.c
file which is used by both the driver and nvmecontrol.
Sponsored by: Intel
Reviewed by: carl
MFC after: 3 days
be pulled into FreeBSD. From now, FreeBSD will be considered the
upstream repo.
First step: move the drivers away from the contrib area and into
the base system.
A follow-on commit will include the drivers in the amd64 GENERIC kernel.
ixgbe driver. As it was, when building them as a module INET
and INET6 are not defined. In these drivers it does not cause
a panic, however it does result in different behavior in the
ioctl routine when you are using a module vs static, and I
think the behavior should be the same.
MFC after: 3 days
This is a port of NetBSD's GSoC 2012 Ext3 HTree directory indexing
by Vyacheslav Matyushin. It was cleaned up and enhanced for FreeBSD
by Zheng Liu (lz@).
This is an excellent example of work shared among different projects:
Vyacheslav was able to look at an early prototype from Zheng Liu who
was also able to check the code from Haiku (with permission).
As in linux, the feature is not available by default and must be
enabled explicitly with tune2fs. We still do not support the
workarounds required in readdir for NFS.
Submitted by: Zheng Liu
Tested by: Mike Ma
Sponsored by: Google Inc.
MFC after: 1 week
algorithm, which is based on the 2011 v0.1 patch release and described in the
paper "Revisiting TCP Congestion Control using Delay Gradients" by David Hayes
and Grenville Armitage. It is implemented as a kernel module compatible with the
modular congestion control framework.
CDG is a hybrid congestion control algorithm which reacts to both packet loss
and inferred queuing delay. It attempts to operate as a delay-based algorithm
where possible, but utilises heuristics to detect loss-based TCP cross traffic
and will compete effectively as required. CDG is therefore incrementally
deployable and suitable for use on shared networks.
In collaboration with: David Hayes <david.hayes at ieee.org> and
Grenville Armitage <garmitage at swin edu au>
MFC after: 4 days
Sponsored by: Cisco University Research Program and FreeBSD Foundation
- Reconnect with some minor modifications, in particular now selsocket()
internals are adapted to use sbintime units after recent'ish calloutng
switch.
same as top-level target name for "device runfw" kernel option and
caused cyclic dependancy that lead to kernel build breakage
Module change is not strictly required and done for name unification sake
PR: conf/175751
Submitted by: Issei <i10a at herbmint.jp>
(which should be a PCIE Gen 3 slot for this adapter) by looking back thru the PCI
parent devices to the slot device.
The fix above also corrects the bandwidth display to GT/s rather than the
incorrect Gb/s
Next, allow the use of ALTQ if you select the compile option IXGBE_LEGACY_TX.
Allow the use of 'unsupported' optic modules by a compile option as well.
Add a phy reset capability into the stop code, this is so a static configured
driver will still behave properly when taken down (not being able to unload it).
This revision synchronizes the shared code with Intel internal current code,
and note that it now includes DCB supporting code, this was necessitated by
some internal changes with the code, but it also will provide the opportunity
to develop this feature in the core driver down the road.
I have edited the README to get rid of some of the worse anachronisms in it
as well, its by no means as robust as I might wish at this point however.
Oh, I also have included some conditional stuff in the code so it will be
compatible in both the 9.X and 10 environments.
Performance has been a focus in recent changes and I believe this revision
driver will perform very well in most workloads.
MFC after: 2 weeks
The AR9485 chip and AR933x SoC both implement LNA diversity.
There are a few extra things that need to happen before this can be
flipped on for those chips (mostly to do with setting up the different
bias values and LNA1/LNA2 RSSI differences) but the first stage is
putting this code into the driver layer so it can be reused.
This has the added benefit of making it easier to expose configuration
options and diagnostic information via the ioctl API. That's not yet
being done but it sure would be nice to do so.
Tested:
* AR9285, with LNA diversity enabled
* AR9285, with LNA diversity disabled in EEPROM
Realtek RTL8188CU/RTL8192CU USB IEEE 802.11b/g/n wireless cards.
This driver requires microcode which is available in FreeBSD ports:
net/urtwn-firmware-kmod.
Hiren ported the urtwn(4) man page from OpenBSD and Glen just commited a port
for the firmware.
TODO:
- 802.11n support
- Stability fixes - the driver can sustain lots of traffic but has trouble
coping with simultaneous iperf sessions.
- fix debugging
MFC after: 2 months
Tested by: kevlo, hiren, gjb
for the WB195 combo NIC - an AR9285 w/ an AR3011 USB bluetooth NIC.
The AR3011 is wired up using a 3-wire coexistence scheme to the AR9285.
The code in if_ath_btcoex.c sets up the initial hardware mapping
and coexistence configuration. There's nothing special about it -
it's static; it doesn't try to configure bluetooth / MAC traffic priorities
or try to figure out what's actually going on. It's enough to stop basic
bluetooth traffic from causing traffic stalls and diassociation from
the wireless network.
To use this code, you must have the above NIC. No, it won't work
for the AR9287+AR3012, nor the AR9485, AR9462 or AR955x combo cards.
Then you set a kernel hint before boot or before kldload, where 'X'
is the unit number of your AR9285 NIC:
# kenv hint.ath.X.btcoex_profile=wb195
This will then appear in your boot messages:
[100482] athX: Enabling WB195 BTCOEX
This code is going to evolve pretty quickly (well, depending upon my
spare time) so don't assume the btcoex API is going to stay stable.
In order to use the bluetooth side, you must also load in firmware using
ath3kfw and the binary firmware file (ath3k-1.fw in my case.)
Tested:
* AR9280, no interference
* WB195 - AR9285 + AR3011 combo; STA mode; basic bluetooth inquiries
were enough to cause traffic stalls and disassociations. This has
stopped with the btcoex profile code.
TODO:
* Importantly - the AR9285 needs ASPM disabled if bluetooth coexistence
is enabled. No, I don't know why. It's likely some kind of bug to do
with the AR3011 sending bluetooth coexistence signals whilst the device
is asleep. Since we don't actually sleep the MAC just yet, it shouldn't
be a problem. That said, to be totally correct:
+ ASPM should be disabled - upon attach and wakeup
+ The PCIe powersave HAL code should never be called
Look at what the ath9k driver does for inspiration.
* Add WB197 (AR9287+AR3012) support
* Add support for the AR9485, which is another combo like the AR9285
* The later NICs have a different signaling mechanism between the MAC
and the bluetooth device; I haven't even begun to experiment with
making that HAL code work. But it should be a lot more automatic.
* The hardware can do much more interesting traffic weighting with
bluetooth and wifi traffic. None of this is currently used.
Ideally someone would code up something to watch the bluetooth traffic
GPIO (via an interrupt) and then watch it go high/low; then figure out
what the bluetooth traffic is and adjust things appropriately.
* If I get the time I may add in some code to at least track this stuff
and expose statistics. But it's up to someone else to experiment with
the bluetooth coexistence support and add the interesting stuff (like
"real" detection of bulk, audio, etc bluetooth traffic patterns and
change wifi parameters appropriately - eg, maximum aggregate length,
transmit power, using quiet time to control TX duty cycle, etc.)
seven arguments.
The original test uses Solaris' uadmin system call to trigger the test
probe; this change adds a sysctl to the dtrace_test module and gets the test
program to trigger the test probe via the sysctl handler.
The test is currently failing on amd64 because of some bugs in the way that
probe arguments beyond the first five are obtained - these bugs will be
fixed in a separate change.
QLogic 8300 Series Adapters
Submitted by: David C Somayajulu (davidcs@freebsd.org) QLogic Corporation
Approved by: George Neville-Neil (gnn@freebsd.org)
The NTB allows you to connect two systems with this device using a PCI-e
link. The driver is made of two modules:
- ntb_hw which is a basic hardware abstraction layer for the device.
- if_ntb which implements the ntb network device and the communication
protocol.
The driver is limited at the moment to CPU memcpy instead of using DMA, and
only Back-to-Back mode is supported. Also the network device isn't full
featured yet. These changes will be coming soon. The DMA change will also
bring in the ioat driver from the project branch it is on now.
This is an initial port of the GPL/BSD Linux driver contributed by Jon Mason
from Intel. Any bugs are my contributions.
Sponsored by: Intel
Reviewed by: jimharris, joel (man page only)
Approved by: jimharris (mentor)
it will work with either the old or new server.
The FHA code keeps a cache of currently active file handles for
NFSv2 and v3 requests, so that read and write requests for the same
file are directed to the same group of threads (reads) or thread
(writes). It does not currently work for NFSv4 requests. They are
more complex, and will take more work to support.
This improves read-ahead performance, especially with ZFS, if the
FHA tuning parameters are configured appropriately. Without the
FHA code, concurrent reads that are part of a sequential read from
a file will be directed to separate NFS threads. This has the
effect of confusing the ZFS zfetch (prefetch) code and makes
sequential reads significantly slower with clients like Linux that
do a lot of prefetching.
The FHA code has also been updated to direct write requests to nearby
file offsets to the same thread in the same way it batches reads,
and the FHA code will now also send writes to multiple threads when
needed.
This improves sequential write performance in ZFS, because writes
to a file are now more ordered. Since NFS writes (generally
less than 64K) are smaller than the typical ZFS record size
(usually 128K), out of order NFS writes to the same block can
trigger a read in ZFS. Sending them down the same thread increases
the odds of their being in order.
In order for multiple write threads per file in the FHA code to be
useful, writes in the NFS server have been changed to use a LK_SHARED
vnode lock, and upgrade that to LK_EXCLUSIVE if the filesystem
doesn't allow multiple writers to a file at once. ZFS is currently
the only filesystem that allows multiple writers to a file, because
it has internal file range locking. This change does not affect the
NFSv4 code.
This improves random write performance to a single file in ZFS, since
we can now have multiple writers inside ZFS at one time.
I have changed the default tuning parameters to a 22 bit (4MB)
window size (from 256K) and unlimited commands per thread as a
result of my benchmarking with ZFS.
The FHA code has been updated to allow configuring the tuning
parameters from loader tunable variables in addition to sysctl
variables. The read offset window calculation has been slightly
modified as well. Instead of having separate bins, each file
handle has a rolling window of bin_shift size. This minimizes
glitches in throughput when shifting from one bin to another.
sys/conf/files:
Add nfs_fha_new.c and nfs_fha_old.c. Compile nfs_fha.c
when either the old or the new NFS server is built.
sys/fs/nfs/nfsport.h,
sys/fs/nfs/nfs_commonport.c:
Bring in changes from Rick Macklem to newnfs_realign that
allow it to operate in blocking (M_WAITOK) or non-blocking
(M_NOWAIT) mode.
sys/fs/nfs/nfs_commonsubs.c,
sys/fs/nfs/nfs_var.h:
Bring in a change from Rick Macklem to allow telling
nfsm_dissect() whether or not to wait for mallocs.
sys/fs/nfs/nfsm_subs.h:
Bring in changes from Rick Macklem to create a new
nfsm_dissect_nonblock() inline function and
NFSM_DISSECT_NONBLOCK() macro.
sys/fs/nfs/nfs_commonkrpc.c,
sys/fs/nfsclient/nfs_clkrpc.c:
Add the malloc wait flag to a newnfs_realign() call.
sys/fs/nfsserver/nfs_nfsdkrpc.c:
Setup the new NFS server's RPC thread pool so that it will
call the FHA code.
Add the malloc flag argument to newnfs_realign().
Unstaticize newnfs_nfsv3_procid[] so that we can use it in
the FHA code.
sys/fs/nfsserver/nfs_nfsdsocket.c:
In nfsrvd_dorpc(), add NFSPROC_WRITE to the list of RPC types
that use the LK_SHARED lock type.
sys/fs/nfsserver/nfs_nfsdport.c:
In nfsd_fhtovp(), if we're starting a write, check to see
whether the underlying filesystem supports shared writes.
If not, upgrade the lock type from LK_SHARED to LK_EXCLUSIVE.
sys/nfsserver/nfs_fha.c:
Remove all code that is specific to the NFS server
implementation. Anything that is server-specific is now
accessed through a callback supplied by that server's FHA
shim in the new softc.
There are now separate sysctls and tunables for the FHA
implementations for the old and new NFS servers. The new
NFS server has its tunables under vfs.nfsd.fha, the old
NFS server's tunables are under vfs.nfsrv.fha as before.
In fha_extract_info(), use callouts for all server-specific
code. Getting file handles and offsets is now done in the
individual server's shim module.
In fha_hash_entry_choose_thread(), change the way we decide
whether two reads are in proximity to each other.
Previously, the calculation was a simple shift operation to
see whether the offsets were in the same power of 2 bucket.
The issue was that there would be a bucket (and therefore
thread) transition, even if the reads were in close
proximity. When there is a thread transition, reads wind
up going somewhat out of order, and ZFS gets confused.
The new calculation simply tries to see whether the offsets
are within 1 << bin_shift of each other. If they are, the
reads will be sent to the same thread.
The effect of this change is that for sequential reads, if
the client doesn't exceed the max_reqs_per_nfsd parameter
and the bin_shift is set to a reasonable value (22, or
4MB works well in my tests), the reads in any sequential
stream will largely be confined to a single thread.
Change fha_assign() so that it takes a softc argument. It
is now called from the individual server's shim code, which
will pass in the softc.
Change fhe_stats_sysctl() so that it takes a softc
parameter. It is now called from the individual server's
shim code. Add the current offset to the list of things
printed out about each active thread.
Change the num_reads and num_writes counters in the
fha_hash_entry structure to 32-bit values, and rename them
num_rw and num_exclusive, respectively, to reflect their
changed usage.
Add an enable sysctl and tunable that allows the user to
disable the FHA code (when vfs.XXX.fha.enable = 0). This
is useful for before/after performance comparisons.
nfs_fha.h:
Move most structure definitions out of nfs_fha.c and into
the header file, so that the individual server shims can
see them.
Change the default bin_shift to 22 (4MB) instead of 18
(256K). Allow unlimited commands per thread.
sys/nfsserver/nfs_fha_old.c,
sys/nfsserver/nfs_fha_old.h,
sys/fs/nfsserver/nfs_fha_new.c,
sys/fs/nfsserver/nfs_fha_new.h:
Add shims for the old and new NFS servers to interface with
the FHA code, and callbacks for the
The shims contain all of the code and definitions that are
specific to the NFS servers.
They setup the server-specific callbacks and set the server
name for the sysctl and loader tunable variables.
sys/nfsserver/nfs_srvkrpc.c:
Configure the RPC code to call fhaold_assign() instead of
fha_assign().
sys/modules/nfsd/Makefile:
Add nfs_fha.c and nfs_fha_new.c.
sys/modules/nfsserver/Makefile:
Add nfs_fha_old.c.
Reviewed by: rmacklem
Sponsored by: Spectra Logic
MFC after: 2 weeks
and kern.cam.ctl.disable tunable; those were introduced as a workaround
to make it possible to boot GENERIC on low memory machines.
With ctl(4) being built as a module and automatically loaded by ctladm(8),
this makes CTL work out of the box.
Reviewed by: ken
Sponsored by: FreeBSD Foundation
option left but actually consumed by ada(4), so move it to opt_ada.h
and get rid of opt_ata.h.
- Fix stand-alone build of atacore(4) by adding opt_cam.h.
- Use __FBSDID.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
- Move ata_timeout() to ata-all.c so we don't need to expose both this
function and ata_cam_end_transaction() but only the former.
- Move ata_cmd2str() from ata-queue.c to ata-all.c so we can get rid of
the former.
- Add some missing prototypes.
MFC after: 3 days
most kernels before FreeBSD 9.0. Remove such modules and respective kernel
options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the
atacontrol utility and some man pages. Remove useless now options ATA_CAM.
No objections: current@, stable@
MFC after: never
Merge change from illumos:
1368 enablings on defunct providers prevent providers from unregistering
We try to address some underlying differences between the Solaris
and FreeBSD implementations: dtrace_attach() / dtrace_detach() are
currently unimplemented in FreeBSD but the new code from illumos
makes use of taskq so some adaptations were made to dtrace_open()
and dtrace_close() to handle them appropriately.
Illumos Revision: r13430:8e6add739e38
Reference:
https://www.illumos.org/issues/1368
Reviewed by: gnn
Tested by: Fabian Keil
Obtained from: Illumos
MFC after: 3 weeks
includes support for the NIC and TOE features of the 40G, 10G, and
1G/100M cards based on the T5.
The ASIC is mostly backward compatible with the Terminator 4 so cxgbe(4)
has been updated instead of writing a brand new driver. T5 cards will
show up as cxl (short for cxlgb) ports attached to the t5nex bus driver.
Sponsored by: Chelsio
the older if_start/non-multiqueue interface from the stack. This
is not the default, but can be turned on in the Makefile now regardless
of the OS level to allow either testing or use of ALTQ.
MFC after: one week
much of which is not necessary for PowerPC.
The FBT module can likely be factored into 3 separate files: common,
intel, and powerpc, rather than duplicating most of the code between
the x86 and PowerPC flavors.
All DTrace modules for PowerPC will be MFC'd together once Fasttrap is
completed.
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.
The change is mostly mechanical but few notes are reported:
* The KPI changes as follow:
- VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
- VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
- VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
- VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
(in order to avoid visibility of implementation details)
- The read-mode operations are added:
VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
sys/mutex.h in consumers directly to cater its inlining functions
using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
the compat layer because the name clash between FreeBSD and solaris
versions must be avoided.
At this purpose zfs redefines the vm_object locking functions
directly, isolating the FreeBSD components in specific compat stubs.
The KPI results heavilly broken by this commit. Thirdy part ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho
The early commit is done to facilitate the off-tree work on the
porting of the Radeon driver.
Sponsored by: The FreeBSD Foundation
Debugged and tested by: dumbbell
MFC after: 1 month
from the tree since few months (please note that the userland bits
were already disconnected since a long time, thus there is no need
to update the OLD* entries).
This is not targeted for MFC.
- Add support for IPv6 rx csum offload
- Finally switch mxge from using its own driver lro, to
using tcp_lro
MFC after: 7 days
Sponsored by: Myricom Inc.
The "blackhole" driver was used in conjunction with bhyve to sequester
pci devices intended for passthru until vmm.ko was loaded. This was
useful at one point because vmm.ko could not be loaded at boot time.
The same functionality can now be achieved by loading vmm.ko via the
loader along with the kernel.
Discussed with: grehan
Obtained from: NetApp
generating binary diffs.
- Constify a few strings used in the driver.
- Style changes to make the driver compile with default clang settings.
Approved by: HighPoint Technologies
MFC after: 3 days
- Add full support for IPv6 addresses.
- Read the size of the L2 table during attach. Do not assume that PCIe
physical function 4 of the card has all of the table to itself.
- Use FNV instead of Jenkins to hash L3 addresses and drop the private
copy of jhash.h from the driver.
MFC after: 1 week
- In sys/ofed/drivers/infiniband/core/cma.c, an enum struct member is
interpreted as an int, so cast it to an int.
- In sys/ofed/drivers/infiniband/core/ud_header.c, initialize the
packet_length variable in ib_ud_header_init(), to prevent undefined
behaviour.
- In sys/ofed/drivers/infiniband/ulp/sdp/sdp_rx.c, call rdma_notify()
with the correct enum type and value.
- In sys/ofed/include/linux/pci.h, change the PCI_DEVICE and PCI_VDEVICE
macros to use C99 struct initializers, so additional members can be
overridden.
Reviewed by: delphij, Garrett Cooper <yanegomi@gmail.com>
MFC after: 1 week
There is one known issue: Some probes will display an error message along the
lines of: "Invalid address (0)"
I tested this with both a simple dtrace probe and dtruss on a few different
binaries on 32-bit. I only compiled 64-bit, did not run it, but I don't expect
problems without the modules loaded. Volunteers are welcome.
MFC after: 1 month
warns about unused variables in this code, so always add -Wno-unused to
the warning flags. Why gcc on x86 *doesn't* warn about this, I will never
know. The code itself should probably be fixed at some point.
GIANT from VFS. In addition, disconnect also netsmb, which is a base
requirement for SMBFS.
In the while SMBFS regular users can use FUSE interface and smbnetfs
port to work with their SMBFS partitions.
Also, there are ongoing efforts by vendor to support in-kernel smbfs,
so there are good chances that it will get relinked once properly locked.
This is not targeted for MFC.
GIANT from VFS. This code is particulary broken and fragile and other
in-kernel implementations around, found in other operating systems,
don't really seem clean and solid enough to be imported at all.
If someone wants to reconsider in-kernel NTFS implementation for
inclusion again, a fair effort for completely fixing and cleaning it
up is expected.
In the while NTFS regular users can use FUSE interface and ntfs-3g
port to work with their NTFS partitions.
This is not targeted for MFC.
GIANT from VFS. In addition, disconnect also netncp, which is a base
requirement for NWFS.
In the possibility of a future maintenance of the code and later
readd to the FreeBSD base, maybe we should think about a better location
for netncp. I'm not entirely sure the / top location is actually right,
however I will let network people to comment on that more specifically.
This is not targeted for MFC.
sdchi encapsulates a generic SD Host Controller logic that relies on
actual hardware driver for register access.
sdhci_pci implements driver for PCI SDHC controllers using new SDHCI
interface
No kernel config modifications are required, but if you load sdhc
as a module you must switch to sdhci_pci instead.
This has been developed during 2 summer of code mandates and being revived
by gnn recently.
The functionality in this commit mirrors entirely content of fusefs-kmod
port, which doesn't need to be installed anymore for -CURRENT setups.
In order to get some sparse technical notes, please refer to:
http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html
or to the project branch:
svn://svn.freebsd.org/base/projects/fuse/
which also contains granular history of changes happened during port
refinements. This commit does not came from the branch reintegration
itself because it seems svn is not behaving properly for this functionaly
at the moment.
Partly Sponsored by: Google, Summer of Code program 2005, 2011
Originally submitted by: ilya, Csaba Henk <csaba-ml AT creo DOT hu >
In collabouration with: pho
Tested by: flo, gnn, Gustau Perez,
Kevin Oberman <rkoberman AT gmail DOT com>
MFC after: 2 months
The code builds a map of regions that were freed. On every write the
code consults the map and eventually removes ranges that were freed
before, but are now overwritten.
Freed blocks are not TRIMed immediately. There is a tunable that defines
how many txg we should wait with TRIMming freed blocks (64 by default).
There is a low priority thread that TRIMs ranges when the time comes.
During TRIM we keep in-flight ranges on a list to detect colliding
writes - we have to delay writes that collide with in-flight TRIMs in
case something will be reordered and write will reached the disk before
the TRIM. We don't have to do the same for in-flight writes, as
colliding writes just remove ranges to TRIM.
Sponsored by: multiplay.co.uk
This work includes some important fixes and some improvements obtained
from the zfsonlinux project, including TRIMming entire vdevs on pool
create/add/attach and on pool import for spare and cache vdevs.
Obtained from: zfsonlinux
Submitted by: Etienne Dechamps <etienne.dechamps@ovh.net>
reside, and move there ipfw(4) and pf(4).
o Move most modified parts of pf out of contrib.
Actual movements:
sys/contrib/pf/net/*.c -> sys/netpfil/pf/
sys/contrib/pf/net/*.h -> sys/net/
contrib/pf/pfctl/*.c -> sbin/pfctl
contrib/pf/pfctl/*.h -> sbin/pfctl
contrib/pf/pfctl/pfctl.8 -> sbin/pfctl
contrib/pf/pfctl/*.4 -> share/man/man4
contrib/pf/pfctl/*.5 -> share/man/man5
sys/netinet/ipfw -> sys/netpfil/ipfw
The arguable movement is pf/net/*.h -> sys/net. There are
future plans to refactor pf includes, so I decided not to
break things twice.
Not modified bits of pf left in contrib: authpf, ftp-proxy,
tftp-proxy, pflogd.
The ipfw(4) movement is planned to be merged to stable/9,
to make head and stable match.
Discussed with: bz, luigi
generator, found on IvyBridge and supposedly later CPUs, accessible
with RDRAND instruction.
From the Intel whitepapers and articles about Bull Mountain, it seems
that we do not need to perform post-processing of RDRAND results, like
AES-encryption of the data with random IV and keys, which was done for
Padlock. Intel claims that sanitization is performed in hardware.
Make both Padlock and Bull Mountain random generators support code
covered by kernel config options, for the benefit of people who prefer
minimal kernels. Also add the tunables to disable hardware generator
even if detected.
Reviewed by: markm, secteam (simon)
Tested by: bapt, Michael Moll <kvedulv@kvedulv.de>
MFC after: 3 weeks
warnings in sys/gnu/fs/xfs. The only warnings that still need to be
suppressed are those about array bound overruns of flexible array
members in xfs_dir2_{block,sf}.c, which are too expensive (in terms of
cascading code changes) to fix.
MFC after: 1 week
X-MFC-With: r239959
linking it statically into the kernel. With our gcc in base there are
no warnings, so also remove the WERROR= from the module makefile.
Noted by: Eir Nym <eirnym@gmail.com>
MFC after: 1 week
Basically, this is automatic rx zero copy when feasible. TCP payload is
DMA'd directly into the userspace buffer described by the uio submitted
in soreceive by an application.
- Works with sockets that are being handled by the TCP offload engine
of a T4 chip (you need t4_tom.ko module loaded after cxgbe, and an
"ifconfig +toe" on the cxgbe interface).
- Does not require any modification to the application.
- Not enabled by default. Use hw.t4nex.<X>.toe.ddp="1" to enable it.
'encapsulating interface' used with IPsec and has nothing to do with
storage 'enclosure' services.
MFC after: 3 days
Noticed while: debugging why enc(4) is no longer automatically created
subdevice ahciem. Emulate SEMB SES device from AHCI LED interface to expose
it to users in form of ses(4) CAM device. If we ever see AHCI controllers
supporting SES of SAF-TE over I2C as described by specification, they should
fit well into this new picture.
Sponsored by: iXsystems, Inc.
* Introduce TX DMA setup/teardown methods, mirroring what's done in
the RX path.
Although the TX DMA descriptor is setup via ath_desc_alloc() /
ath_desc_free(), there TX status descriptor ring will be allocated
in this path.
* Remove some of the TX EDMA capability probing from the RX path and
push it into the new TX EDMA path.
These probes are most useful when looking into the structures
they provide, which are listed in io.d. For example:
dtrace -n 'io:genunix::start { printf("%d\n", args[0]->bio_bcount); }'
Note that the I/O systems in FreeBSD and Solaris/Illumos are sufficiently
different that there is not a 1:1 mapping from scripts that work
with one to the other.
MFC after: 1 month
shared code update and small changes in core required
Add support for new i210/i211 devices
Improve queue calculation based on mac type
MFC after:5 days
Asus laptops. It is alike to acpi_asus(4), but uses WMI interface instead
of separate ACPI device.
On Asus EeePC T101MT netbook it allows to handle hotkeys and on/off WLAN,
Bluetooth, LCD backlight, camera, cardreader and touchpad.
On Asus UX31A ultrabook it allows to handle hotkeys, on/off WLAN, Bluetooth,
Wireless LED, control keyboard backlight brightness, monitor temperature
and fan speed. LCD brightness control doesn't work now for unknown reason,
possibly requiring some video card initialization.
Sponsored by: iXsystems, Inc.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
These are available as t3_tom and t4_tom modules that augment cxgb(4)
and cxgbe(4) respectively. The cxgb/cxgbe drivers continue to work as
usual with or without these extra features.
- iWARP driver for Terminator 3 ASIC (kernel verbs). T4 iWARP in the
works and will follow soon.
Build-tested with make universe.
30s overview
============
What interfaces support TCP offload? Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE
Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe
Which connections are offloaded? Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe
Reviewed by: bz, gnn
Sponsored by: Chelsio communications.
MFC after: ~3 months (after 9.1, and after ensuring MFC is feasible)
Add TSO6 and LRO/IPv6 support.
Fix the module Makefile to at least properly inlcude opt_inet6.h
and allow builds without INET or INET6.
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole)
MFC After: 3 days
Allow LRO to work on IPv6 as well.
Fix the module Makefile to at least properly inlcude opt_inet6.h
and allow builds without INET or INET6.
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole)
MFC After: 3 days
Revamp the CAM enclosure services driver.
This updated driver uses an in-kernel daemon to track state changes and
publishes physical path location information\for disk elements into the
CAM device database.
Sponsored by: Spectra Logic Corporation
Sponsored by: iXsystems, Inc.
Submitted by: gibbs, will, mav
operations required by GEMified i915.ko. It also attaches to SandyBridge
and IvyBridge CPU northbridges now.
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
There's some TX path TDMA code in if_ath_tx.c which should be migrated
out, but first I should likely try and verify/fix/repair the TDMA support
in 9.x and -HEAD.
The NAND Flash environment consists of several distinct components:
- NAND framework (drivers harness for NAND controllers and NAND chips)
- NAND simulator (NANDsim)
- NAND file system (NAND FS)
- Companion tools and utilities
- Documentation (manual pages)
This work is still experimental. Please use with caution.
Obtained from: Semihalf
Supported by: FreeBSD Foundation, Juniper Networks
defined by the SNIA Common RAID Disk Data Format Specification v2.0.
Supports multiple volumes per array and multiple partitions per disk.
Supports standard big-endian and Adaptec's little-endian byte ordering.
Supports all single-layer RAID levels. Dual-layer RAID levels except
RAID10 are not supported now because of GEOM RAID design limitations.
Some work is still to be done, but the present code already manages basic
interoperation with RAID BIOS of the Adaptec 1430SA SATA RAID controller.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
- When switching to 4-bit operation, send a SET_CLR_CARD_DETECT command
to disconnect the card-detect pull-up resistor from the DAT3 line before
sending the SET_BUS_WIDTH command.
- Add the missing "reserved" zero entry to the mantissa table used to
decode various CSD fields. This was causing SD cards to report that they
could run at 30 MHz instead of the maximum 25 MHz mandated in the spec.
o Enhancements:
- At the MMC layer, format various info from the CID into a string that
uniquely identifies the card instance (manufacturer number, serial
number, product name and revision, etc). Export it as an instance
variable.
- At the MMCSD layer, display the formatted card ID string, and also
report the clock speed of the hardware (not the card's max speed), and
the number of bits and number of blocks per transfer. It comes out like
this now:
mmcsd0: 968MB <SD SD01G 8.0 SN 276886905 MFG 08/2008 by 3 SD> at mmc0
22.5MHz/4bit/128-block
o Use DEVMETHOD_END.
o Use NULL instead of 0 for pointers.
PR: 156496
Submitted by: Ian Lepore
MFC after: 1 week
- Mark 'sdp' as requiring 'inet'.
- Always include "opt_inet.h" and "opt_inet6.h" and modify the IB
driver Makefiles to honor WITH/WITHOUT_INET/INET6/_SUPPORT options
to determine what should be enabled during a module build.
- Fix the mlxen(4) driver and the core IB code to compile without
if INET is disabled (including when both INET and INET6 are disabled).
Reviewed by: bz
MFC after: 2 weeks
First cut of new HW support from LSI and merge into FreeBSD.
Supports Drake Skinny and ThunderBolt cards.
MFhead_mfi r227574
Style
MFhead_mfi r227579
Use bus_addr_t instead of uintXX_t.
MFhead_mfi r227580
MSI support
MFhead_mfi r227612
More bus_addr_t and remove "#ifdef __amd64__".
MFhead_mfi r227905
Improved timeout support from Scott.
MFhead_mfi r228108
Make file.
MFhead_mfi r228208
Fixed botched merge of Skinny support and enhanced handling
in call back routine.
MFhead_mfi r228279
Remove superfluous !TAILQ_EMPTY() checks before TAILQ_FOREACH().
MFhead_mfi r228310
Move mfi_decode_evt() to taskqueue.
MFhead_mfi r228320
Implement MFI_DEBUG for 64bit S/G lists.
MFhead_mfi r231988
Restore structure layout by reverting the array header to
use [0] instead of [1].
MFhead_mfi r232412
Put wildcard pattern later in the match table.
MFhead_mfi r232413
Use lower case for hexadecimal numbers to match surrounding
style.
MFhead_mfi r232414
Add more Thunderbolt variants.
MFhead_mfi r232888
Don't act on events prior to boot or when shutting down.
Add hw.mfi.detect_jbod_change to enable or disable acting
on JBOD type of disks being added on insert and removed on
removing. Switch hw.mfi.msi to 1 by default since it works
better on newer cards.
MFhead_mfi r233016
Release driver lock before taking Giant when deleting children.
Use TAILQ_FOREACH_SAFE when items can be deleted. Make code a
little simplier to follow. Fix a couple more style issues.
MFhead_mfi r233620
Update mfi_spare/mfi_array with the actual number of elements
for array_ref and pd. Change these max. #define names to avoid
name space collisions. This will require an update to mfiutil
It avoids mfiutil having to do a magic calculation.
Add a note and #define to state that a "SYSTEM" disk is really
what the firmware calls a "JBOD" drive.
Thanks to the many that helped, LSI for the initial code drop,
mav, delphij, jhb, sbruno that all helped with code and testing.
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
Sponsored by: NETASQ
MFC after: 1 month
sys/dev/mps/mps_sas.c:861:1: error: function 'mpssas_discovery_timeout' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
mpssas_discovery_timeout(void *data)
^
Because the driver is obtained from upstream, we don't want to modify
it; just silence the warning instead, it is harmless.
MFC after: 3 days
Winbond Super I/O chips.
With minor efforts it should be possible the extend the driver to support
further chips/revisions available from Winbond. In the simplest case
only new IDs need to be added, while different chipsets might require
their own function to enter extended function mode, etc.
Sponsored by: Sandvine Incorporated ULC (in 2011)
Reviewed by: emaste, brueffer
MFC after: 2 weeks
get rid of testing explicitly for clang (using ${CC:T:Mclang}) in
individual Makefiles.
Instead, use the following extra macros, for use with clang:
- NO_WERROR.clang (disables -Werror)
- NO_WCAST_ALIGN.clang (disables -Wcast-align)
- NO_WFORMAT.clang (disables -Wformat and friends)
- CLANG_NO_IAS (disables integrated assembler)
- CLANG_OPT_SMALL (adds flags for extra small size optimizations)
As a side effect, this enables setting CC/CXX/CPP in src.conf instead of
make.conf! For clang, use the following:
CC=clang
CXX=clang++
CPP=clang-cpp
MFC after: 2 weeks
sys/dev/hpt27xx/osm_bsd.c, since it gets the following warnings:
sys/dev/hpt27xx/osm_bsd.c:1180:25: error: format string is not a string literal (potentially insecure) [-Werror,-Wformat-security]
S_IRUSR | S_IWUSR, driver_name);
^~~~~~~~~~~
@/dev/hpt27xx/hpt27xx_config.h:46:21: note: expanded from:
#define driver_name hpt27xx_driver_name
^~~~~~~~~~~~~~~~~~~
Since 'hpt27xx_driver_name' is a constant string symbol (coming from the
proprietary hpt27xx_lib.o file), there is no security problem.
Because this driver is provided by the vendor, and applying changes
requires re-certification and other bureaucratic exercises, just disable
the warning for now.
MFC after: 1 week
with clang. Also fix a number of warnings uncovered when building with
clang around some implicit enum conversions.
Sponsored by: Intel
Approved by: scottl
kernel modules that include binary-only code.
More fine-grained control is provided via MK_SOURCELESS_HOST (for native code
that runs on host CPU) and MK_SOURCELESS_UCODE (for microcode).
Reviewed by: julian, delphij, freebsd-arch
Approved by: kib (mentor)
MFC after: 2 weeks
The isci driver is for the integrated SAS controller in the Intel C600
(Patsburg) chipset. Source files in sys/dev/isci directory are
FreeBSD-specific, and sys/dev/isci/scil subdirectory contains
an OS-agnostic library (SCIL) published by Intel to control the SAS
controller. This library is used primarily as-is in this driver, with
some post-processing to better integrate into the kernel build
environment.
isci.4 and a README in the sys/dev/isci directory contain a few
additional details.
This driver is only built for amd64 and i386 targets.
Sponsored by: Intel
Reviewed by: scottl
Approved by: scottl
This involves significant changes to the mps(4) driver, but is not a
complete rewrite.
Some of the changes in this version of the driver:
- Integrated RAID (IR) support.
- Support for WarpDrive controllers.
- Support for SCSI protection information (EEDP).
- Support for TLR (Transport Level Retries), needed for tape drives.
- Improved error recovery code.
- ioctl interface compatible with LSI utilities.
mps.4: Update the mps(4) driver man page somewhat for the driver
changes. The list of supported hardware still needs to be
updated to reflect the full list of supported cards.
conf/files: Add the new driver files.
mps/mpi/*: Updated version of the MPI header files, with a BSD style
copyright.
mps/*: See above for a description of the new driver features.
modules/mps/Makefile:
Add the new mps(4) driver files.
Submitted by: Kashyap Desai <Kashyap.Desai@lsi.com>
Reviewed by: ken
MFC after: 1 week
versions derived from /usr/ports/audio/oss.
The particular headers used were taken from the
attic/drv/oss_allegro directory and are mostly identical
to the previous files.
The Maestro3 driver is now free from the GPL.
NOTE: due to lack of testers this driver is being
considered for deprecation and removal.
PR: kern/153920
Approved by: jhb (mentor)
MFC after: 2 weeks
is used.
Although the module _builds_, it fails to load because of a missing symbol from
ieee80211_tdma.c.
Specifics:
* Always build ieee80211_tdma.c in the module;
* only compile in the code if IEEE80211_SUPPORT_TDMA is defined.
The USB code as it stands includes the bus glue along _with_ the controller
code. So the ohci/ehci modules actually build the USB controller code and
the PCI bus glue.
It'd be nice to ship separate modules for the PCI glue and the USB
controller (so for example if there were a USB controller hanging off
the internal SoC bus as well as an external PCI device) it could be done.
This is primarily done to save a few bytes here and there on embedded
systems with limited flash space for kernels - a very limited (sub-1MB)
space may be available for the kernel and may only support gzip encoding.
The rootfs can be LZMA compressed.
This is primarily done to save a few bytes here and there on embedded
systems with limited flash space for kernels - a very limited (sub-1MB)
space may be available for the kernel and may only support gzip encoding.
The rootfs can be LZMA compressed.
- Huge old hdac driver was split into three independent pieces: HDA
controller driver (hdac), HDA CODEC driver (hdacc) and HDA sudio function
driver (hdaa).
- Support for multichannel recording was added. Now, as specification
defines, driver checks input associations for pins with sequence numbers
14 and 15, and if found (usually) -- works as before, mixing signals
together. If it doesn't, it configures input association as multichannel.
- Signal tracer was improved to look for cases where several DACs/ADCs in
CODEC can work with the same audio signal. If such case found, driver
registers additional playback/record stream (channel) for the pcm device.
- New controller streams reservation mechanism was implemented. That
allows to have more pcm devices then streams supported by the controller
(usually 4 in each direction). Now it limits only number of simultaneously
transferred audio streams, that is rarely reachable and properly reported
if happens.
- Codec pins and GPIO signals configuration was exported via set of
writable sysctls. Another sysctl dev.hdaa.X.reconfig allows to trigger
driver reconfiguration in run-time.
- Driver now decodes pins location and connector type names. In some cases
it allows to hint user where on the system case connectors, related to the
pcm device, are located. Number of channels supported by pcm device,
reported now (if it is not 2), should also make search easier.
- Added workaround for digital mic on some Asus laptops/netbooks.
MFC after: 2 months
Sponsored by: iXsystems, Inc.
This uses the emuxkireg.h already used in the emu10k1
snd driver. Special thanks go to Alexander Motin as
he was able to find some errors and reverse engineer
some wrong values in the emuxkireg header.
The emu10kx driver is now free from the GPL.
PR: 153901
Tested by: mav, joel
Approved by: jhb (mentor)
MFC after: 2 weeks
This introduces:
* a basic wtap interface
* a HAL, which implements an abstraction layer for implementing
different device behavious;
* A visibility plugin, which allows for control over which nodes
see other nodes (useful for mesh work.)
It doesn't yet implement sta/adhoc/hostap modes but these are quite
feasible to implement.
Monthadar uses it to do 802.11s mesh verification.
The userland tools will be committed in a follow-up commit.
Submitted by: Monthadar Al Jaberi <monthadar@gmail.com>