set on the new thread. This prevents the thread from inadvertently
inheriting affinity from a random sibling.
Submitted by: attilio
Tested by: pho
MFC after: 1 week
kernel modules that include binary-only code.
More fine-grained control is provided via MK_SOURCELESS_HOST (for native code
that runs on host CPU) and MK_SOURCELESS_UCODE (for microcode).
Reviewed by: julian, delphij, freebsd-arch
Approved by: kib (mentor)
MFC after: 2 weeks
in r179783 as (ab)using the concept of VRFs for this had not worked.
At this point SCTP in FreeBSD does not support multi-FIB, neither for
IPv4 nor for IPv6.
Discussed with: rrs
Sponsored by: Cisco Systems, Inc.
Try to make the "rtable" handling work but the current version of
pf(4) does not fully support it yet as especially callers of
PF_MISMATCHAW() are not fully FIB-aware. OpenBSD seems to have
fixed this in a later version. Prepare as much as possible.
Sponsored by: Cisco Systems, Inc.
the original IPv4 implementation from r178888:
- Use RT_DEFAULT_FIB in the IPv4 implementation where noticed.
- Use rt*fib() KPI with explicit RT_DEFAULT_FIB where applicable in
the NFS code.
- Use the new in6_rt* KPI in TCP, gif(4), and the IPv6 network stack
where applicable.
- Split in6_rtqtimo() and in6_mtutimo() as done in IPv4 and equally
prevent multiple initializations of callouts in in6_inithead().
- Use wrapper functions where needed to preserve the current KPI to
ease MFCs. Use BURN_BRIDGES to indicate expected future cleanup.
- Fix (related) comments (both technical or style).
- Convert to rtinit() where applicable and only use custom loops where
currently not possible otherwise.
- Multicast group, most neighbor discovery address actions and faith(4)
are locked to the default FIB. Individual IPv6 addresses will only
appear in the default FIB, however redirect information and prefixes
of connected subnets are automatically propagated to all FIBs by
default (mimicking IPv4 behavior as closely as possible).
Sponsored by: Cisco Systems, Inc.
the (maximum) number of FIBs trying to clarify that evetually FIBs
should probably attached to domain(9) specific storage. [1]
Add a comment on a limitimation on the rt_add_addr_allfibs option.
Use RT_DEFAULT_FIB instead of 0 where applicable.
Add empty line to functions without local variables per style.
Put public yet unused in-tree function rtinit_fib() under BURN_BRIDGES
to indicate that it might go away in the future.
No functional change.
Discussed with: julian [1] (clarification on what the original one meant)
Sponsored by: Cisco Systems, Inc.
While doing so, for consistency with the rtalloc_ign_fib(9) interface
called, remove the "in_" prefix from rtalloc_ign_wrapper() no longer
indicating that it would only handle the INET case.
Sponsored by: Cisco Systems, Inc.
tables (FIBs) as IPv4.
Prepare various general rt* functions for multi-FIB IPv6 handling in
addition to already existing multi-FIB IPv4 cases.
Sponsored by: Cisco Systems, Inc.
of the default one.
Without this change setting kern.cam.ada.default_timeout to 1 instead of 30
allowed me to trigger several false positive command timeouts under heavy
ZFS load on a SiI3132 siis(4) controller with 5 HDDs on a port multiplier.
MFC after: 1 week
netback.c: Add missing VM includes.
xen/xenvar.h,
xen/xenpmap.h: Move some XENHVM macros from <machine/xen/xenpmap.h> to
<machine/xen/xenvar.h> on i386 to match the amd64 headers.
conf/files: Add netback to the build.
Submitted by: jhb
MFC after: 3 days
Even having more specific hint.ata.X.mode controls, global ones are
still could be useful from some points, including compatibility.
PR: kern/164651
MFC after: 1 week
The cs driver requires a table with firmware values. An
alternative firmware is available in a similar Open Sound
System driver. This is actually a partial revert of
Revision 77504.
Special thanks to joel@ for patiently testing several
replacement attempts.
The csa driver and the complete sound system are now free
of the GPL.
Tested by: joel
Approved by: jhb (mentor)
MFC after: 3 weeks
disconnected swap device.
This is quick and imperfect solution, as swap device will still be opened
and GEOM will not be able to destroy it. Proper solution would be to
automatically turn off and close disconnected swap device, but with existing
code it will cause panic if there is at least one page on device, even if
it is unimportant page of the user-level process. It needs some work.
Reviewed by: kib@
MFC after: 1 week
channels available
- current code treats bits 4:7 in 'SATAHC interrupt mask' and 'SATAHC
interrupt cause' as flags for SATA channels 2 and 3
- for embedded SATA controllers (SoC) these bits have been marked as reserved
in datasheets so far, but for some new and upcoming chips they are used for
purposes other than SATA
Submitted by: Lukasz Plachno
Reviewed by: mav
Obtained from: Semihalf
MFC after: 2 weeks
to cleanup routes from a single ifa.
o Implement carp_addroute()/carp_delroute() via above functions.
o Call carp_ifa_delroute() in the carp_detach() to avoid
junk routes left in routing table, in case if user
removes an address in a MASTER state. [1]
Reported by: az [1]
it is possible that a single AIO event will be reported to multiple
threads, it is not threading friendly, and the existing API can not
control this behavior.
Allocate a kevent flags field sigev_notify_kevent_flags for AIO event
notification in sigevent, and allow user to pass EV_CLEAR, EV_DISPATCH
or EV_ONESHOT to AIO kernel code, user can control whether the event
should be cleared once it is retrieved by a thread. This change should
be comptaible with existing application, because the field should have
already been zero-filled, and no additional action will be taken by
kernel.
PR: kern/156567
* Override the TX/RX stream count if the EEPROM reports a single RX or
TX stream, rather than assuming the device will always be a 2x2 strea
device.
* For AR9280 devices, don't hard-code 2x2 stream. Instead, allow the
ar5416FillCapabilityInfo() routine to correctly determine things.
The latter should be done for all 11n chips now that
ar5416FillCapabilityInfo() will set the TX/RX stream count based on the
active TX/RX chainmask in the EEPROM.
Thanks to Maciej Milewski for donating some AR9281 NICs to me for
testing.
hardware imposes strict limitations on hard buffer and block sizes.
Previous code set soft buffer to be no smaller then hard buffer. On some
cards with fixed 64K physical buffer that caused up to 800ms play latency.
New code allows to set soft buffer size down to just two blocks of the hard
buffer and to not write more then that size ahead to the hardware buffer.
As result of that change I was able to reduce full practically measured
record-playback loop delay in those conditions down to only about 115ms
with theoretical playback latency of only about 50ms.
New code works fine for both vchans and direct cases. In both cases sound(4)
tries to follow hw.snd.latency_profile and hw.snd.latency values and
application-requested buffer and block sizes as much as limitation of two
hardware blocks allows.
Reviewed by: silence on multimedia@
The isci driver is for the integrated SAS controller in the Intel C600
(Patsburg) chipset. Source files in sys/dev/isci directory are
FreeBSD-specific, and sys/dev/isci/scil subdirectory contains
an OS-agnostic library (SCIL) published by Intel to control the SAS
controller. This library is used primarily as-is in this driver, with
some post-processing to better integrate into the kernel build
environment.
isci.4 and a README in the sys/dev/isci directory contain a few
additional details.
This driver is only built for amd64 and i386 targets.
Sponsored by: Intel
Reviewed by: scottl
Approved by: scottl
any thread doing an I/O RPC with a transfer size greater than
NFS_UDPMAXDATA will be hung indefinitely, retrying the RPC.
After a discussion on freebsd-fs@, I decided to add a warning
message for this case, as suggested by Jeremy Chadwick.
Suggested by: freebsd at jdc.parodius.com (Jeremy Chadwick)
MFC after: 2 weeks
under the subject "F_RDLCK lock to FreeBSD NFS fails to R/O target file".
This occurred because the server side NLM always checked for VWRITE
access, irrespective of the type of lock request. This patch
replaces VOP_ACCESS(..VWRITE..) with one appropriate to
the lock operation. It allows unlock and lock cancellation
to be done without a check of VOP_ACCESS(), so that files
can't be left locked indefinitely after the file permissions
have been changed.
Discussed with: zack
Submitted by: jwd (earlier version)
Reviewed by: dfr
MFC after: 2 weeks
compliance testing.
In order to allow for radar pattern matching to occur, the DFS CAC/NOL
handling needs to be made configurable. This commit introduces a new
sysctl, "net.wlan.dfs_debug", which controls which DFS debug mode
net80211 is in.
* 0 = default, CSA/NOL handling as per normal.
* 1 = announce a CSA, but don't add the channel to the non-occupy list
(NOL.)
* 2 = disable both CSA and NOL - only print that a radar event occured.
This code is not compiled/enabled by default as it breaks regulatory
handling. A user must enable IEEE80211_DFS_DEBUG in their kernel
configuration file for this option to become available.
Obtained from: Atheros
* For legacy NICs, the combined RSSI should be used.
For earlier AR5416 NICs, use control chain 0 RSSI rather than combined
RSSI.
For AR5416 > version 2.1, use the combined RSSI again.
* Add in a missing AR5212 HAL method (get11nextbusy) which may be called
by radar code.
This serves no functional change for what's currently in FreeBSD.
about new child not only when doing PT_TO_SCX, but also for PT_CONTINUE.
If TDB_FORK flag is set, always issue a stop, the same as is done for
TDB_EXEC.
Reported by: Dmitry Mikulin <dmitrym juniper net>
MFC after: 1 week
that instead of using direct I/O it allows read-ahead similar to
POSIX_FADV_NORMAL, but invokes VOP_ADVISE(POSIX_FADV_DONTNEED) after the
read(2) has completed to purge just-read data. The write(2) path continues
to use direct I/O for POSIX_FADV_NOREUSE for now. Note that NOREUSE works
optimally if an application reads and writes full fs blocks.
that we have the lock now. This cleans up a locking panic ASSERT when
knlist_empty is called without a lock when INVARIANTS etc. are turned.
Reviewed by: kib jhb
MFC after: 1 week
so try harder to get the CDMA sync interrupt delivered and also in
a more efficient way:
- wrap the whole process of sending and receiving the CDMA sync
interrupt in a critical section so we don't get preempted,
- send the CDMA sync interrupt to the CPU that is actually waiting
for it to happen so we don't take a detour via another CPU,
- instead of waiting for up to 15 seconds for the interrupt to
trigger try the whole process for up to 15 times using a one
second timeout (the code was also changed to just ignore belated
interrupts of a previous tries should they appear).
According to testing done by Peter Jeremy with the debugging also
added as part of this commit the first two changes apparently are
sufficient to now properly get the CDMA sync interrupts delivered
at the first try though.
VIS-based block copy/zero implementations. While with 4BSD it's
sufficient to just disable the tick interrupts, with ULE+PREEMPTION
it's otherwise also possible that these are preempted via IPIs.
* Grab the net80211com lock when calling ieee80211_dfs_notify_radar().
* Use the tsf extend function to turn the 64 bit base TSF into a per-
frame 64 bit TSF. This will improve radiotap logging (which will
now have a (more) correct per-frame TSF, rather then the single TSF64
value read at the beginning of ath_rx_proc().
primitives by breaking stop_scheduler into a per-thread variable.
Also, store the new td_stopsched very close to td_*locks members as
they will be accessed mostly in the same codepaths as td_stopsched and
this results in avoiding a further cache-line pollution, possibly.
STOP_SCHEDULER() was pondered to use a new 'thread' argument, in order to
take advantage of already cached curthread, but in the end there should
not really be a performance benefit, while introducing a KPI breakage.
In collabouration with: flo
Reviewed by: avg
MFC after: 3 months (or never)
X-MFC: r228424
helper since r230632, use these for output and panicing during the
early cycles and move cninit() until after the static per-CPU data
has been set up. This solves a couple of issue regarding the non-
availability of the static per-CPU data:
- panic() not working and only making things worse when called,
- having to supply a special DELAY() implementation to the low-level
console drivers,
- curthread accesses of mutex(9) usage in low-level console drivers
that aren't conditional due to compiler optimizations (basically,
this is the problem described in r227537 but in this case for
keyboards attached via uart(4)). [1]
PR: 164123 [1]
implementing a simple OF_panic() that may be used during the early
cycles when panic() isn't available, yet.
- Mark cpu_{exit,shutdown}() as __dead2 as appropriate.
VM_KMEM_SIZE_SCALE to 2, awaiting more insight from alc@. As it turns
out, the VM apparently has problems with machines that have large holes
in the physical address space, causing the kmem_suballoc() call in
kmeminit() to fail with a VM_KMEM_SIZE_SCALE of 1. Using a value of 2
allows these, namely Blade 1500 with 2GB of RAM, to boot.
PR: 164227
cards powering up at once. Work around the easy case (multiple cards
inserted on boot) with a short sleep and a long comment. This
improves reliability on those laptops with power hungry cards.
excluding other allocations including UMA now entails the addition of
a single flag to kmem_alloc or uma zone create
Reviewed by: alc, avg
MFC after: 2 weeks
cleared/dropped leading to qid2tap[n] being NULL as there no longer
is a tap. Now, if there have been lots of frames queued the firmware
processes and returns those after the tap is gone.
Tested by: osa
MFC after: 1 week
NFS clients was reported to freebsd-fs@ under the subject "NFS
corruption in recent HEAD" on Nov. 26, 2011. This problem occurred when
a TCP mounted root fs was changed to using UDP. I believe that this
problem was caused by the change in mnt_stat.f_iosize that occurred
because rsize was decreased to the maximum supported by UDP. This
patch fixes the problem by using v_bufobj.bo_bsize instead of f_iosize,
since the latter is set to f_iosize when the vnode is allocated, but
does not change for a given vnode when f_iosize changes.
Reported by: pjd
Reviewed by: kib
MFC after: 2 weeks
Remove unneeded temporary variable (data) to better match the OSS code.
Remove some unused constants and type definitions.
Tested by: joel
Approved by: jhb (mentor)
MFC after: 3 weeks
referenced within its timeout window. This change clears the LLE_VALID flag when an llentry
is removed from an interface's hash table and adds an extra check to the flowtable code
for the LLE_VALID flag in llentry to avoid retaining and using a stale reference.
Reviewed by: qingli@
MFC after: 2 weeks
This involves significant changes to the mps(4) driver, but is not a
complete rewrite.
Some of the changes in this version of the driver:
- Integrated RAID (IR) support.
- Support for WarpDrive controllers.
- Support for SCSI protection information (EEDP).
- Support for TLR (Transport Level Retries), needed for tape drives.
- Improved error recovery code.
- ioctl interface compatible with LSI utilities.
mps.4: Update the mps(4) driver man page somewhat for the driver
changes. The list of supported hardware still needs to be
updated to reflect the full list of supported cards.
conf/files: Add the new driver files.
mps/mpi/*: Updated version of the MPI header files, with a BSD style
copyright.
mps/*: See above for a description of the new driver features.
modules/mps/Makefile:
Add the new mps(4) driver files.
Submitted by: Kashyap Desai <Kashyap.Desai@lsi.com>
Reviewed by: ken
MFC after: 1 week
data changes.
cam_ccb.h: Add a new advanced information type, CDAI_TYPE_RCAPLONG,
for long read capacity data.
cam_xpt_internal.h:
Add a read capacity data pointer and length to struct cam_ed.
cam_xpt.c: Free the read capacity buffer when a device goes away.
While we're here, make sure we don't leak memory for other
malloced fields in struct cam_ed.
scsi_all.c: Update the scsi_read_capacity_16() to take a uint8_t * and
a length instead of just a pointer to the parameter data
structure. This will hopefully make this function somewhat
immune to future changes in the parameter data.
scsi_all.h: Add some extra bit definitions to struct
scsi_read_capacity_data_long, and bump up the structure
size to the full size specified by SBC-3.
Change the prototype for scsi_read_capacity_16().
scsi_da.c: Register changes in read capacity data with the transport
layer. This allows the transport layer to send out an
async notification to interested parties. Update the
dasetgeom() API.
Use scsi_extract_sense_len() instead of
scsi_extract_sense().
scsi_xpt.c: Add support for the new CDAI_TYPE_RCAPLONG advanced
information type.
Make sure we set the physpath pointer to NULL after freeing
it. This allows blindly freeing it in the struct cam_ed
destructor.
sys/param.h: Bump __FreeBSD_version from 1000005 to 1000006 to make it
easier for third party drivers to determine that the read
capacity data async notification is available.
camcontrol.c,
mptutil/mpt_cam.c:
Update these for the new scsi_read_capacity_16() argument
structure.
Sponsored by: Spectra Logic
share/man/man4/Makefile,
share/man/man4/xnb.4,
sys/dev/xen/netback/netback.c,
sys/dev/xen/netback/netback_unit_tests.c:
Rewrote the netback driver for xen to attach properly via newbus
and work properly in both HVM and PVM mode (only HVM is tested).
Works with the in-tree FreeBSD netfront driver or the Windows
netfront driver from SuSE. Has not been extensively tested with
a Linux netfront driver. Does not implement LRO, TSO, or
polling. Includes unit tests that may be run through sysctl
after compiling with XNB_DEBUG defined.
sys/dev/xen/blkback/blkback.c,
sys/xen/interface/io/netif.h:
Comment elaboration.
sys/kern/uipc_mbuf.c:
Fix page fault in kernel mode when calling m_print() on a
null mbuf. Since m_print() is only used for debugging, there
are no performance concerns for extra error checking code.
sys/kern/subr_scanf.c:
Add the "hh" and "ll" width specifiers from C99 to scanf().
A few callers were already using "ll" even though scanf()
was handling it as "l".
Submitted by: Alan Somers <alans@spectralogic.com>
Submitted by: John Suykerbuyk <johns@spectralogic.com>
Sponsored by: Spectra Logic
MFC after: 1 week
Reviewed by: ken
- add "+HP" in case of headphones redirection;
- add device type for analog devices, if all pins have the same.
As result now it may look like "Analog 5.1+HP/2.0" or "Front Analog Mic".
I hope it will be more useful than long and confusing.
MFC after: 2 months
Sponsored by: iXsystems, Inc.
casted types: to ssize_t in filesystem code and to
int in buf code, thus supplying a negative argument
leads to kernel panic later. To fix that check user
supplied argument in the beginning of syscall.
Submitted by: Maxim Dounin <mdounin mdounin.ru>, maxim@
- remove experimental code for disabling CRC
- use the correct constant for conversion between interrupt rate
and EITR values (the previous values were off by a factor of 2)
- make dev.ix.N.queueM.interrupt_rate a RW sysctl variable.
Changing individual values affects the queue immediately,
and propagates to all interfaces at the next reinit.
- add dev.ix.N.queueM.irqs rdonly sysctl, to export the actual
interrupt counts
Netmap-related changes for ixgbe:
- use the "new" format for TX descriptors in netmap mode.
- pass interrupt mitigation delays to the user process doing poll()
on a netmap file descriptor.
On the RX side this means we will not check the ring more than once
per interrupt. This gives the process a chance to sleep and process
packets in larger batches, thus reducing CPU usage.
On the TX side we take this even further: completed transmissions are
reclaimed every half ring even if the NIC interrupts more often.
This saves even more CPU without any additional tx delays.
Generic Netmap-related changes:
- align the netmap_kring to cache lines so that there is no false sharing
(possibly useful for multiqueue NICs and MSIX interrupts, which are
handled by different cores). It's a minor improvement but it does not
cost anything.
Reviewed by: Jack Vogel
Approved by: Jack Vogel
Unmounts do vfs_msync() before calling VFS_UNMOUNT(), but there is
still a race allowing a process to dirty pages after msync
finished. Remounts rw->ro just left dirty pages in system.
Reviewed by: alc, tegge (long time ago)
Tested by: pho
MFC after: 2 weeks
appropriate timestamps. Restore the assertions which verify that
NCF_TS is set when timestamp is asked for.
Reviewed by: jhb (previous version)
MFC after: 2 weeks
selection in snd_hda(4) driver.
Now driver tracks jack presence detection status for every CODEC pin. For
playback associations, when configured, that information, same as before,
can be used to automatically redirect audio to headphones. Also same as
before, these events are used to track digital display connection status
and fetch ELD. Now in addition to that driver uses that information to
automatically switch recording source of the mixer to the connected input.
When there are devices with no jack detection and with one both connected,
last ones will have the precedence. As result, on most laptops after boot
internal microphone should be automatically selected. But if external one
(for example, headset) connected, it will be selected automatically.
When external mic disconnected, internal one will be selected again.
Automatic recording source selection is enabled by default now to make
recording work out of the box without touching mixer. But it can be
disabled or limited only to attach time using hint.pcm.X.rec.autosrc loader
tunables or dev.pcm.X.rec.autosrc sysctls.
MFC after: 2 months
Sponsored by: iXsystems, Inc.
we will only trust a positive name cache entry for a specified amount of
time before falling back to a LOOKUP RPC, even if the ctime for the file
handle matches the cached copy in the name cache entry. The timeout is
configured via a new 'nametimeo' mount option and defaults to 60 seconds.
It may be set to zero to disable positive name caching entirely.
Reviewed by: rmacklem
MFC after: 1 week
- Enter instead of ENTER
- Remove colons
- Line up option values
- Use dots to provide a line to visually connect the menu
selections with their values
- Replace Enabled/Disabled with off/On
(bigger inital cap for "On" is a visual indicator)
- Remove confusing "Boot" from selections that don't boot.
- With loader_color=1 in /boot/loader.conf, use reverse video to
highlight enabled options
PR: misc/160818
Submitted by: Warren Block <wblock wonkity com>
Reviewed by: Devin Teske <devin dot teske fisglobal com>, current@
MFC after: 1 week
in response to CAM_DEV_NOT_THERE, instead of just the LUN in question.
This will now just eliminate the specified LUN in response to
CAM_DEV_NOT_THERE.
Reported by: Richard Todd <rmtodd@servalan.servalan.com>
MFC after: 3 days
The actual ia6->ia6_lifetime access is hidden in
IFA6_IS_INVALID/IFA6_IS_DEPRECATED macros since a long time ago
(see netinet6/nd6.c, r1.104 of KAME for the reference).
MFC after: 3 days
from TCP to UDP and the rsize/wsize/readdirsize is greater
than NFS_MAXDGRAMDATA, it is possible for a thread doing an
I/O RPC to get stuck repeatedly doing retries. This happens
because the RPC will use a resize/wsize/readdirsize that won't
work for UDP and, as such, it will keep failing indefinitely.
This patch returns an error for this case, to avoid the problem.
A discussion on freebsd-fs@ seemed to indicate that returning
an error was preferable to silently ignoring the "udp"/"mntudp"
option.
This problem was discovered while investigating a problem reported
by pjd@ via email.
MFC after: 2 weeks
(HDMI and HBR bits set) and needed (AC3 format used with 8 channels).
This should allow DTS-HD/TrueHD pass-through with rates above 6.144Mbps.
MFC after: 2 months
Sponsored by: iXsystems, Inc.
consistently, creating some namecache entries without NCF_TS flag.
This causes panic due to failed assertion.
As a temporal relief, remove the assert. Return epoch timestamp for
the entries without timestamp if asked.
While there, consolidate the code which returns timestamps, into a
helper cache_out_ts().
Discussed with: jhb
MFC after: 2 weeks
widgets. I am not sure if S/PDIF supports 32bit samples, but my Marantz
SR4001 doesn't, producing only single clicks on playback start/stop.
Because HDA controller requires 32bit alignment for all samples above 16bit,
we can't handle this situation in regular way and have to set 32bit format
in sound(4) for anything above 16bit. To workaround the problem, prefer
to setup hardware to use 24/20bit samples when 32bit format requested. Add
dev.pcm.X.play.32bit and dev.pcm.X.rec.32bit sysctls to control what format
really use for 32bit samples.
MFC after: 2 months
Sponsored by: iXsystems, Inc.
hash with names of its hooks. It starts with size of 16, and
grows when number of hooks reaches twice the current size. A
failure to grow (memory is allocated with M_NOWAIT) isn't
fatal, however.
I used standard hash(9) function for the hash. With 25000
hooks named in the mpd (ports/net/mpd5) manner of "b%u", the
distributions is the following: 72.1% entries consist of one
element, 22.1% consist of two, 5.2% consist of three and
0.6% of four.
Speedup in a synthetic test that creates 25000 hooks and then
runs through a long cyclce dereferencing them in a random order
is over 25 times.
mutex(9) to rwlock(9) based locks.
While here remove dropping lock when processing NGM_LISTNODES,
and NGM_LISTTYPES generic commands. We don't need to drop it
since memory allocation is done with M_NOWAIT.
- retrive only one, specified limit for a process, not the whole
array, as it was previously (the sysctl has been added recently and
has not been backported to stable yet, so this change is ok);
- allow to set a resource limit for another process.
Submitted by: Andrey Zonov <andrey at zonov.org>
Discussed with: kib
Reviewed by: kib
MFC after: 2 weeks
maximal from 64K to 256K.
We usually don't need 750 sound interrupts per second (1.3ms latency)
when playing 192K/24/8 stream. 187 should be better. On usual 48K/16/2
it is just enough for hw.snd.latency=9 at hw.snd.latency_profile=1 with
23 and 6 interrupts per second.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Previous code was relatively dumb. During CODEC probe it was tracing signals
and statically binding amplifier controls to the OSS mixer controls. To set
volume it just set all bound amplifier controls proportionally to mixer
level, not looking on their hierarchy and amplification levels/offsets.
New code is much smarter. It also traces signals during probe, but mostly
to find out possible amplification control rages in dB for each specific
signal. To set volume it retraces each affected signal again and sets
amplifiers controls recursively to reach desired amplification level in dB.
It would be nice to export values in dB to user, but unluckily our OSS mixer
API is too simple for that.
As result of this change:
- cascaded amplifiers will work together to reach maximal precision.
If some input has 0/+40dB preamplifier with 10dB step and -10/+10dB mixer
with 1dB step after it, new code will use both to provide 0/+40dB control
with 1dB step! We could even get -10/+50dB range there, but that is
intentionally blocked for now.
- different channels of multichannel associations on non-uniform CODECs
such as VIA VT1708S will have the same volume, not looking that control
ranges are different. It was not good when fronts were 12dB louder.
- for multiplexed recording, when we can record from only one source at
a time, we can now use recording amplifier controls to set different
volume levels for different inputs if they have no own controls of they
are less precise. If recording source change, amplifiers will be
reconfigured.
To improve out-of-the-box behavior, ignore default volume levels set by
sound(4) and use own, more reasonable: +20dB for mics, -10dB for analog
output volume and 0dB for the rest of controls. sound(4) defaults of 75%
mean absolutely random things for different controls of different CODECs
because of very different control ranges.
Together with further planned automatic recording source selection this
should allow users to get fine playback and recording without touching
mixer first.
Note that existing users should delete /var/db/mixer*-state and reboot
or trigger CODEC reconfiguration to get new default values.
MFC after: 2 months
Sponsored by: iXsystems, Inc.
This makes it much easier to determine whether an event occurs in the
net80211 taskqueue (which was called "ath0 taskq") or the ath driver
taskqueue (which is also called "ath0 taskq".)
comments to longer, also refining strange ones.
Properly use #ifdef rather than #if defined() where possible. Four
#if defined(PCBGROUP) occurances (netinet and netinet6) were ignored to
avoid conflicts with eventually upcoming changes for RSS.
Reported by: bde (most)
Reviewed by: bde
MFC after: 3 days
provide struct namecache_ts which is the old struct namecache. Only
allocate struct namecache_ts if non-null struct timespec *tsp was
passed to cache_enter_time, otherwise use struct namecache.
Change struct namecache allocation and deallocation macros into static
functions, since logic becomes somewhat twisty. Provide accessor for
the nc_name member of struct namecache to hide difference between
struct namecache and namecache_ts.
The aim of the change is to not waste 20 bytes per small namecache
entry.
Reviewed by: jhb
MFC after: 2 weeks
X-MFC-note: after r230394
names and wants to sort them by name, ie. when executes:
# zfs list -t snapshot -o name -s name
Because only name is needed we don't have to read all snapshot properties.
Below you can find how long does it take to list 34509 snapshots from a single
disk pool before and after this change with cold and warm cache:
before:
# time zfs list -t snapshot -o name -s name > /dev/null
cold cache: 525s
warm cache: 218s
after:
# time zfs list -t snapshot -o name -s name > /dev/null
cold cache: 1.7s
warm cache: 1.1s
MFC after: 1 week
fit into existing mcontext_t.
On i386 and amd64 do return the extended FPU states using
getcontextx(3). For other architectures, getcontextx(3) returns the
same information as getcontext(2).
Tested by: pho
MFC after: 1 month
64bit and 32bit ABIs. As a side-effect, it enables AVX on capable
CPUs.
In particular:
- Query the CPU support for XSAVE, list of the supported extensions
and the required size of FPU save area. The hw.use_xsave tunable is
provided for disabling XSAVE, and hw.xsave_mask may be used to
select the enabled extensions.
- Remove the FPU save area from PCB and dynamically allocate the
(run-time sized) user save area on the top of the kernel stack,
right above the PCB. Reorganize the thread0 PCB initialization to
postpone it after BSP is queried for save area size.
- The dumppcb, stoppcbs and susppcbs now do not carry the FPU state as
well. FPU state is only useful for suspend, where it is saved in
dynamically allocated suspfpusave area.
- Use XSAVE and XRSTOR to save/restore FPU state, if supported and
enabled.
- Define new mcontext_t flag _MC_HASFPXSTATE, indicating that
mcontext_t has a valid pointer to out-of-struct extended FPU
state. Signal handlers are supplied with stack-allocated fpu
state. The sigreturn(2) and setcontext(2) syscall honour the flag,
allowing the signal handlers to inspect and manipilate extended
state in the interrupted context.
- The getcontext(2) never returns extended state, since there is no
place in the fixed-sized mcontext_t to place variable-sized save
area. And, since mcontext_t is embedded into ucontext_t, makes it
impossible to fix in a reasonable way. Instead of extending
getcontext(2) syscall, provide a sysarch(2) facility to query
extended FPU state.
- Add ptrace(2) support for getting and setting extended state; while
there, implement missed PT_I386_{GET,SET}XMMREGS for 32bit binaries.
- Change fpu_kern KPI to not expose struct fpu_kern_ctx layout to
consumers, making it opaque. Internally, struct fpu_kern_ctx now
contains a space for the extended state. Convert in-kernel consumers
of fpu_kern KPI both on i386 and amd64.
First version of the support for AVX was submitted by Tim Bird
<tim.bird am sony com> on behalf of Sony. This version was written
from scratch.
Tested by: pho (previous version), Yamagi Burmeister <lists yamagi org>
MFC after: 1 month
Currently the code is not built by any modules. That will
be fixed later. The Atmel ARM bus interface file part of this
commit is just for sake of example. All registers and bits are
declared like macros and not C-structures like in official
Synopsis header files. This driver mostly origins from the
musb_otg.c driver in FreeBSD except that the chip specific
programming has been replaced by the one for DWC 2.0 USB OTG.
Some parts related to system suspend and resume have been left
like empty functions for the future. USB suspend and resume is
fully supported.
For example, this particular topology didn't work correctly from all
nodes:
[A] - [B] - [C] - [D]
Submitted by: Monthadar Al Jaberi <monthadar@gmail.com>
Reviewed by: bschmidt, adrian
This allows for multiple MAC addresses to be printed on the same
debugging line. ether_sprintf() uses a static char buffer and
thus isn't very useful here.
Submitted by: Monthadar Al Jaberi <monthadar@gmail.com>
of root HUB. Although it is initialized with port index of the
device's parent hub, which is worng. So track the USB tree up to
root HUB and initialize this filed ptroprly
Rename port_index to root_port_index in order to reflect its
real semantics.
versions derived from /usr/ports/audio/oss.
The particular headers used were taken from the
attic/drv/oss_allegro directory and are mostly identical
to the previous files.
The Maestro3 driver is now free from the GPL.
NOTE: due to lack of testers this driver is being
considered for deprecation and removal.
PR: kern/153920
Approved by: jhb (mentor)
MFC after: 2 weeks
profiling and kernel profiling. To enable kernel profiling one has to build
kgmon(8). I will enable the build once I managed to build and test powerpc
(32-bit) kernels with profiling support.
- add a powerpc64 PROF_PROLOGUE for _mcount.
- add macros to avoid adding the PROF_PROLOGUE in certain assembly entries.
- apply these macros where needed.
- add size information to the MCOUNT function.
MFC after: 3 weeks, together with r230291
entries on one client when a directory was renamed on another client. The
root cause for the stale entry being trusted is that each per-vnode nfsnode
structure has a single 'n_ctime' timestamp used to validate positive name
cache entries. However, if there are multiple entries for a single vnode,
they all share a single timestamp. To fix this, extend the name cache
to allow filesystems to optionally store a timestamp value in each name
cache entry. The NFS clients now fetch the timestamp associated with
each name cache entry and use that to validate cache hits instead of the
timestamps previously stored in the nfsnode. Another part of the fix is
that the NFS clients now use timestamps from the post-op attributes of
RPCs when adding name cache entries rather than pulling the timestamps out
of the file's attribute cache. The latter is subject to races with other
lookups updating the attribute cache concurrently. Some more details:
- Add a variant of nfsm_postop_attr() to the old NFS client that can return
a vattr structure with a copy of the post-op attributes.
- Handle lookups of "." as a special case in the NFS clients since the name
cache does not store name cache entries for ".", so we cannot get a
useful timestamp. It didn't really make much sense to recheck the
attributes on the the directory to validate the namecache hit for "."
anyway.
- ABI compat shims for the name cache routines are present in this commit
so that it is safe to MFC.
MFC after: 2 weeks
subject "Data corruption over NFS in -current". During investigation
of this, I came across an ugly bogusity in the new NFS client where
it replaced the cr_uid with the one used for the mount. This was
done so that "system operations" like the NFSv4 Renew would be
performed as the user that did the mount. However, if any other
thread shares the credential with the one doing this operation,
it could do an RPC (or just about anything else) as the wrong cr_uid.
This patch fixes the above, by using the mount credentials instead of
the one provided as an argument for this case. It appears
to have fixed Martin's problem.
This patch is needed for NFSv4 mounts and NFSv3 mounts against
some non-FreeBSD servers that do not put post operation attributes
in the NFSv3 Statfs RPC reply.
Tested by: Martin Cracauer (cracauer at cons.org)
Reviewed by: jhb
MFC after: 2 weeks
pci_get_vpd_readonly_method(). Previously the loop was always running
to completion and falling through to failing with ENXIO.
PR: kern/164313
Submitted by: Chuck Tuffli chuck tuffli net
MFC after: 1 week
ctl_error.c,
ctl_error.h: Take out the ctl_sense_format enumeration, and use
scsi_sense_data_type instead.
Remove ctl_get_sense_format() and switch ctl_build_ua()
over to using scsi_sense_data_type.
ctl_backend_ramdisk.c,
ctl_backend_block.c:
Use C99 structure initializers instead of GNU initializers.
ctl.c: Switch over to using the SCSI sense format enumeration
instead of the CTL-specific enumeration.
Submitted by: dim (partially)
MFC after: 1 month
frightening "unknown" word. In most cases we don't need to know chips
to properly handle them, but having IDs in logs may simplify debugging.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
1. correct the initialization of RDT when there is an ixgbe_init()
while a netmap client is active. This code was previously
in ixgbe_initialize_receive_units() but RDT is overwritten
shortly afterwards in ixgbe_init_locked()
2. add code (not active yet) to disable CRCSTRIP while in netmap mode.
From all evidence i could gather, it seems that when the 82599 has to
write a data block that is not a full cache line, it first reads
the line (64 bytes) and then writes back the updated version.
This hurts reception of min-sized frames, which are only 60 bytes
if the CRC is stripped: i could never get above 11Mpps
(received from one queue) with CRCSTRIP enabled, whyle 64+4-byte
packets reach 14.2 Mpps (the theoretical maximum).
Leaving the CRC in gets us 14.88Mpps for 60+4 byte frames,
(and penalizes 64+4). The min-size case is important not just because
it looks good in benchmarks, but also because this is the size
of pure acks.
Note we cannot leave CRCSTRIP on by default because it is
incompatible with some other features (LRO etc.)
of HDA bus. Handle that from two directions:
- Add support for "striping" (using several SDO lines), if supported.
- Account HDA bus utilization and return error on new stream allocation
attempt if remaining bandwidth is unsifficient.
Most of HDA controllers have one SDO line with 46Mbps output bandwidth.
NVIDIA GF210 has 2 lines - 92Mbps. NVIDIA GF520 has 4 lines - 184Mbps!
MFC after: 2 months
Sponsored by: iXsystems, Inc.
using LOADER_TFTP_SUPPORT excludes this code. Fixes compilation of pxeldr
with -DLOADER_TFTP_SUPPORT
Applicable to stable/9 and stable/8 now.
This appears to not be needed on stable/7 as r212126 has not been MFC'd.
Obtained from: Yahoo! Inc.
MFC after: 2 weeks
- Enable and handle unsolicited responses from digital display pins,
reporting connection and EDID-Like Data (ELD) validity status changes.
- Fetch ELD data, describing connected digital display device audio
capabilities. These data not really used at the moment (user is not
denied to use audio formats not supported by the device), only printed to
verbose logs. But they are useful for debugging. The fact that ELD was
received tells that HDMI link was established and video driver enabled
HDMI audio passthrough. Some old chips may not return ELD, so lack of it
is not necessary a problem.
- Add some more points to CODEC configuration sequence:
- For converter widgets, supporting more then two channels (HDMI/DP
converter widgets support 8), set number of channels to handle.
- For digital display pins (HDMI/DP) fill audio infoframe, reporting
connected device about number of channels and speakers allocation.
- For digital display pins (HDMI/DP) set mapping between channels seen
by software and channels transferred via HDMI/DisplayPort.
- Allow more audio formats, not used for analog connections because of
stereo pairs orientation, but easily applicable to HDMI/DisplayPort: 2.1,
3.0, 3.1, 4.1, 5.0, 6.0, 6.1, 7.0. That list may be filtered later using
info from ELD.
- Disable MSI interrupts for NVIDIA HDA controllers before GT520.
At this point I can successfully play audio over HDMI from NVIDIA GT210
and GT520 cards with nvidia-driver-290.10 driver to Marantz SR4001
receiver in 2.0, 2.1, 3.0, 4.0, 4.1, 5.0 and 5.1 PCM formats at 44, 48,
88 and 96KHz at 16 and 24 bits, same as do AC3/DTS passthrough.
6.0, 6.1, 7.0 and 7.1 PCM formats are not working for me, but I think
it is because of receiver age.
MFC after: 2 months
Sponsored by: iXsystems, Inc.
According to the GCC documentation, the constructs used to implement
<tgmath.h> are only available in C mode. They only cause breakage when
used used with g++.
Reported by: tijl
would fit specified size. Returned mbuf may be a single mbuf,
an mbuf with a cluster from packet zone, or an mbuf with jumbo
cluster of sufficient size.
similarly named CPU instructions.
Since our in-tree binutils gas is not aware of the instructions, and
I have to use the byte-sequence to encode them, hardcode the r/m operand
as (%rdi). This way, first argument of the pseudo-function is already
placed into proper register.
MFC after: 1 week
the allocated memory before calling mtx_init(9) on mtx pointing to it.
Otherwize, random contents of uninitialized memory might occasionally
trigger the assertion.
Reported by: Pavel Polyakov <bsd kobyla org>
Reviewed by: pjd
MFC after: 1 week
filesystem running with journaled soft updates. Until these problems
have been tracked down, return ENOTSUPP when an attempt is made to
take a snapshot on a filesystem running with journaled soft updates.
MFC after: 2 weeks
to an OBJT_VNODE-specific field of the vm object. The same
information can be just as easily obtained from the struct vattr that
is in struct image_params if the latter is passed to
elf*_load_section(). Moreover, by replacing the vmspace and vm
object parameters to elf*_load_section() with a struct image_params
parameter, we actually reduce the size of the object code.
In collaboration with: kib
that the compiler promotes floats to double precision in computations, but
inspection of the output of a cross-compiler indicates that this isn't the
case on powerpc.
dynamic rounding modes, but FPUless chips that use softfloat can support it
because everything is emulated anyway. (We presently have incomplete
support for hardware FPUs.)
Submitted by: Ian Lepore
is used.
Although the module _builds_, it fails to load because of a missing symbol from
ieee80211_tdma.c.
Specifics:
* Always build ieee80211_tdma.c in the module;
* only compile in the code if IEEE80211_SUPPORT_TDMA is defined.
The USB code as it stands includes the bus glue along _with_ the controller
code. So the ohci/ehci modules actually build the USB controller code and
the PCI bus glue.
It'd be nice to ship separate modules for the PCI glue and the USB
controller (so for example if there were a USB controller hanging off
the internal SoC bus as well as an external PCI device) it could be done.
This is primarily done to save a few bytes here and there on embedded
systems with limited flash space for kernels - a very limited (sub-1MB)
space may be available for the kernel and may only support gzip encoding.
The rootfs can be LZMA compressed.
This is primarily done to save a few bytes here and there on embedded
systems with limited flash space for kernels - a very limited (sub-1MB)
space may be available for the kernel and may only support gzip encoding.
The rootfs can be LZMA compressed.
on-board, glued to the AR71xx CPU. These may forgo separate WMAC EEPROMs
(which store configuration and calibration data) and instead store
it in the main board SPI flash.
Normally the NIC reads the EEPROM attached to it to setup various PCI
configuration registers. If this isn't done, the device will probe as
something different (eg 0x168c:abcd, or 0x168c:ff??.) Other setup registers
are also written to which may control important functions.
This introduces a new compile option, AR71XX_ATH_EEPROM, which enables the
use of this particular code. The ART offset in the SPI flash can be
specified as a hint against the relevant slot/device number, for example:
hint.pcib.0.bus.0.17.0.ath_fixup_addr=0x1fff1000
hint.pcib.0.bus.0.18.0.ath_fixup_addr=0x1fff5000
TODO:
* Think of a better name;
* Make the PCIe version of this fixup code also use this option;
* Maybe also check slot 19;
* This has to happen _before_ the SPI flash is set from memory-mapped
to SPI-IO - so document that somewhere.
to being more generic.
Other embedded SoCs also throw the configuration/PCI register
info into flash.
For now I'm just hard-coding the AR9280 option (for on-board AR9220's on
AP94 and commercial designs (eg D-Link DIR-825.))
TODO:
* Figure out how to support it for all 11n SoC NICs by doing it in
ar5416InitState();
* Don't hard-code the EEPROM size - add another field which is set
by the relevant chip initialisation code.
* 'owl_eep_start_loc' may need to be overridden in some cases to 0x0.
I need to do some further digging.
to read strings completely to know the actual size.
As a side effect it fixes the issue with kern.proc.args and kern.proc.env
sysctls, which didn't return the size of available data when calling
sysctl(3) with the NULL argument for oldp.
Note, in get_ps_strings(), which does actual work for proc_getargv() and
proc_getenvv(), we still have a safety limit on the size of data read in
case of a corrupted procces stack.
Suggested by: kib
MFC after: 3 days
- Huge old hdac driver was split into three independent pieces: HDA
controller driver (hdac), HDA CODEC driver (hdacc) and HDA sudio function
driver (hdaa).
- Support for multichannel recording was added. Now, as specification
defines, driver checks input associations for pins with sequence numbers
14 and 15, and if found (usually) -- works as before, mixing signals
together. If it doesn't, it configures input association as multichannel.
- Signal tracer was improved to look for cases where several DACs/ADCs in
CODEC can work with the same audio signal. If such case found, driver
registers additional playback/record stream (channel) for the pcm device.
- New controller streams reservation mechanism was implemented. That
allows to have more pcm devices then streams supported by the controller
(usually 4 in each direction). Now it limits only number of simultaneously
transferred audio streams, that is rarely reachable and properly reported
if happens.
- Codec pins and GPIO signals configuration was exported via set of
writable sysctls. Another sysctl dev.hdaa.X.reconfig allows to trigger
driver reconfiguration in run-time.
- Driver now decodes pins location and connector type names. In some cases
it allows to hint user where on the system case connectors, related to the
pcm device, are located. Number of channels supported by pcm device,
reported now (if it is not 2), should also make search easier.
- Added workaround for digital mic on some Asus laptops/netbooks.
MFC after: 2 months
Sponsored by: iXsystems, Inc.
This function updates path string to vnode's full global path and checks
the size of the new path string against the pathlen argument.
In vfs_domount(), sys_unmount() and kern_jail_set() this new function
is used to update the supplied path argument to the respective global path.
Unbreaks jailed zfs(8) with enforce_statfs set to 1.
Reviewed by: kib
MFC after: 1 month
possible, and double faults within an SLB trap handler are not. The result
is that it possible to take an SLB fault at any time, on any address, for
any reason, at any point in the kernel.
This lets us do two important things. First, it removes the (soft) 16 GB RAM
ceiling on PPC64 as well as any architectural limitations on KVA space.
Second, it lets the kernel tolerate poorly designed hypervisors that
have a tendency to fail to restore the SLB properly after a hypervisor
context switch.
MFC after: 6 weeks
vm_object_pip_{add,subtract}() on the swap object because the swap
object can't be destroyed while the vnode is exclusively locked.
Moreover, even if the swap object could have been destroyed during
tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken
because vm_object_pip_subtract() does not wake up the sleeping thread
that is trying to destroy the swap object.
Free invalid pages after an I/O error. There is no virtue in keeping
them around in the swap object creating more work for the page daemon.
(I believe that any non-busy page in the swap object will now always
be valid.)
vm_pager_get_pages() does not return a standard errno, so its return
value should not be returned by tmpfs without translation to an errno
value.
There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to
occur with the swap object locked.
Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite().
(The swap pager already spam your console if data corruption is
imminent.)
Reviewed by: kib
MFC after: 3 weeks
M_NOWAIT. Currently, the code allows for sleeping in the ioctl path
to guarantee allocation. However code also handles ENOMEM gracefully, so
propagate this error back to user-space, rather than sleeping while
holding the global pf mutex.
Reviewed by: glebius
Discussed with: bz
vfs_mount_error error message facility provided by the nmount
interface.
Clean up formatting of mount warnings which still need to use
kernel printf's since they do not return errors.
Requested by: Craig Rodrigues <rodrigc@crodrigues.org>
MFC after: 2 weeks
the new NFSv4 server where the code follows the wrong list.
Fortunately, for these fairly rare cases, the lc_stateid[]
lists are normally empty. This patch fixes the code to
follow the correct list.
Reported by: tai.horgan at isilon.com
Discussed with: zack
MFC after: 2 weeks
On amd64, link_elf_obj.c must specify KERNBASE rather than
VM_MIN_KERNEL_ADDRESS to vm_map_find() because kernel loadable
modules must be mapped for execution in the same upper region
of the kernel map as the kernel code and data segments.
For MIPS32 KERNBASE lies below KVA area (it's less than
VM_MIN_KERNEL_ADDRESS) so basically vm_map_find got whole
KVA to look through. On MIPS64 it's not the case because
KERNBASE is set to the very end of XKSEG, well out of KVA
bounds, so vm_map_find always fails. We should use
VM_MIN_KERNEL_ADDRESS as a base for vm_map_find.
Details obtained from: alc@
as the system dump device. This was already allowed for GPT. The Linux
swap metadata at the beginning of the partition should not be disturbed
because the crash dump is written at the end.
Reviewed by: alfred, pjd, marcel
MFC after: 2 weeks
Depending on device capabilities use different methods to implement it.
Currently used method can be read/set via kern.cam.da.X.delete_method
sysctls. Possible values are:
NONE - no provisioning support reported by the device;
DISABLE - provisioning support was disabled because of errors;
ZERO - use WRITE SAME (10) command to write zeroes;
WS10 - use WRITE SAME (10) command with UNMAP bit set;
WS16 - use WRITE SAME (16) command with UNMAP bit set;
UNMAP - use UNMAP command (equivalent of the ATA DSM TRIM command).
The last two methods (UNMAP and WS16) are defined by SBC specification and
the UNMAP method is the most advanced one. The rest of methods I've found
supported in Linux, and as soon as they were trivial to implement, then
why not? Hope they will be useful in some cases.
Unluckily I have no devices properly reporting parameters of the logical
block provisioning support via respective VPD pages (0xB0 and 0xB2). So
all info I have/use now is the flag telling whether logical block
provisioning is supported or not. As result, specific methods chosen now
by trying different ones in order (UNMAP, WS16, DISABLE) and checking
completion status to fallback if needed. I don't expect problems from this,
as if something go wrong, it should just disable itself. It may disable
even too aggressively if only some command parameter misfit.
Unlike Linux, which executes each delete with separate request, I've
implemented here the same request aggregation as implemented in ada driver.
Tests on SSDs I have show much better results doing it this way: above
8GB/s of the linear delete on Intel SATA SSD on LSI SAS HBA (mps).
Reviewed by: silence on scsi@
MFC after: 2 month
Sponsored by: iXsystems, Inc.
1. as reported by Alexander Fiveg, the allocator was reporting
half of the allocated memory. Fix this by exiting from the
loop earlier (not too critical because this code is going
away soon).
2. following a discussion on freebsd-current
http://lists.freebsd.org/pipermail/freebsd-current/2012-January/031144.html
turns out that (re)loading the dmamap was expensive and not optimized.
This operation is in the critical path when doing zero-copy forwarding
between interfaces.
At least on netmap and i386/amd64, the bus_dmamap_load can be
completely bypassed if the map is NULL, so we do it.
The latter change gives an almost 3x improvement in forwarding
performance, from the previous 9.5Mpps at 2.9GHz to the current
line rate (14.2Mpps) at 1.733GHz. (this is for 64+4 byte packets,
in other configurations the PCIe bus is a bottleneck).
802.1q-defined 16-bit VID, CFI, and PCP field in host by order) and a
VLAN ID (VID). Tags go in packets. VIDs identify VLANs.
No functional change is intended, so this should be safe to MFC. Further
cleanup with functional changes will be committed separately (for example,
renaming vlan_tag/vlan_tag_p, which modify the KPI and KBI).
Reviewed by: bz
Sponsored by: ADARA Networks, Inc.
MFC after: 3 days
in the CAM XPT bus traversal code, and a number of other periph level
issues.
cam_periph.h,
cam_periph.c: Modify cam_periph_acquire() to test the CAM_PERIPH_INVALID
flag prior to allowing a reference count to be gained
on a peripheral. Callers of this function will receive
CAM_REQ_CMP_ERR status in the situation of attempting to
reference an invalidated periph. This guarantees that
a peripheral scheduled for a deferred free will not
be accessed during its wait for destruction.
Panic during attempts to drop a reference count on
a peripheral that already has a zero reference count.
In cam_periph_list(), use a local sbuf with SBUF_FIXEDLEN
set so that mallocs do not occur while the xpt topology
lock is held, regardless of the allocation policy of the
passed in sbuf.
Add a new routine, cam_periph_release_locked_buses(),
that can be called when the caller already holds
the CAM topology lock.
Add some extra debugging for duplicate peripheral
allocations in cam_periph_alloc().
Treat CAM_DEV_NOT_THERE much the same as a selection
timeout (AC_LOST_DEVICE is emitted), but forgo retries.
cam_xpt.c: Revamp the way the EDT traversal code does locking
and reference counting. This was broken, since it
assumed that the EDT would not change during
traversal, but that assumption is no longer valid.
So, to prevent devices from going away while we
traverse the EDT, make sure we properly lock
everything and hold references on devices that
we are using.
The two peripheral driver traversal routines should
be examined. xptpdperiphtraverse() holds the
topology lock for the entire time it runs.
xptperiphtraverse() is now locked properly, but
only holds the topology lock while it is traversing
the list, and not while the traversal function is
running.
The bus locking code in xptbustraverse() should
also be revisited at a later time, since it is
complex and should probably be simplified.
scsi_da.c: Pay attention to the return value from cam_periph_acquire().
Return 0 always from daclose() even if the disk is now gone.
Add some rudimentary error injection support.
scsi_sg.c: Fix reference counting in the sg(4) driver.
The sg driver was calling cam_periph_release() on close,
but never called cam_periph_acquire() (which increments
the reference count) on open.
The periph code correctly complained that the sg(4)
driver was trying to decrement the refcount when it
was already 0.
Sponsored by: Spectra Logic
MFC after: 2 weeks
CTL is a disk and processor device emulation subsystem originally written
for Copan Systems under Linux starting in 2003. It has been shipping in
Copan (now SGI) products since 2005.
It was ported to FreeBSD in 2008, and thanks to an agreement between SGI
(who acquired Copan's assets in 2010) and Spectra Logic in 2010, CTL is
available under a BSD-style license. The intent behind the agreement was
that Spectra would work to get CTL into the FreeBSD tree.
Some CTL features:
- Disk and processor device emulation.
- Tagged queueing
- SCSI task attribute support (ordered, head of queue, simple tags)
- SCSI implicit command ordering support. (e.g. if a read follows a mode
select, the read will be blocked until the mode select completes.)
- Full task management support (abort, LUN reset, target reset, etc.)
- Support for multiple ports
- Support for multiple simultaneous initiators
- Support for multiple simultaneous backing stores
- Persistent reservation support
- Mode sense/select support
- Error injection support
- High Availability support (1)
- All I/O handled in-kernel, no userland context switch overhead.
(1) HA Support is just an API stub, and needs much more to be fully
functional.
ctl.c: The core of CTL. Command handlers and processing,
character driver, and HA support are here.
ctl.h: Basic function declarations and data structures.
ctl_backend.c,
ctl_backend.h: The basic CTL backend API.
ctl_backend_block.c,
ctl_backend_block.h: The block and file backend. This allows for using
a disk or a file as the backing store for a LUN.
Multiple threads are started to do I/O to the
backing device, primarily because the VFS API
requires that to get any concurrency.
ctl_backend_ramdisk.c: A "fake" ramdisk backend. It only allocates a
small amount of memory to act as a source and sink
for reads and writes from an initiator. Therefore
it cannot be used for any real data, but it can be
used to test for throughput. It can also be used
to test initiators' support for extremely large LUNs.
ctl_cmd_table.c: This is a table with all 256 possible SCSI opcodes,
and command handler functions defined for supported
opcodes.
ctl_debug.h: Debugging support.
ctl_error.c,
ctl_error.h: CTL-specific wrappers around the CAM sense building
functions.
ctl_frontend.c,
ctl_frontend.h: These files define the basic CTL frontend port API.
ctl_frontend_cam_sim.c: This is a CTL frontend port that is also a CAM SIM.
This frontend allows for using CTL without any
target-capable hardware. So any LUNs you create in
CTL are visible in CAM via this port.
ctl_frontend_internal.c,
ctl_frontend_internal.h:
This is a frontend port written for Copan to do
some system-specific tasks that required sending
commands into CTL from inside the kernel. This
isn't entirely relevant to FreeBSD in general,
but can perhaps be repurposed.
ctl_ha.h: This is a stubbed-out High Availability API. Much
more is needed for full HA support. See the
comments in the header and the description of what
is needed in the README.ctl.txt file for more
details.
ctl_io.h: This defines most of the core CTL I/O structures.
union ctl_io is conceptually very similar to CAM's
union ccb.
ctl_ioctl.h: This defines all ioctls available through the CTL
character device, and the data structures needed
for those ioctls.
ctl_mem_pool.c,
ctl_mem_pool.h: Generic memory pool implementation used by the
internal frontend.
ctl_private.h: Private data structres (e.g. CTL softc) and
function prototypes. This also includes the SCSI
vendor and product names used by CTL.
ctl_scsi_all.c,
ctl_scsi_all.h: CTL wrappers around CAM sense printing functions.
ctl_ser_table.c: Command serialization table. This defines what
happens when one type of command is followed by
another type of command.
ctl_util.c,
ctl_util.h: CTL utility functions, primarily designed to be
used from userland. See ctladm for the primary
consumer of these functions. These include CDB
building functions.
scsi_ctl.c: CAM target peripheral driver and CTL frontend port.
This is the path into CTL for commands from
target-capable hardware/SIMs.
README.ctl.txt: CTL code features, roadmap, to-do list.
usr.sbin/Makefile: Add ctladm.
ctladm/Makefile,
ctladm/ctladm.8,
ctladm/ctladm.c,
ctladm/ctladm.h,
ctladm/util.c: ctladm(8) is the CTL management utility.
It fills a role similar to camcontrol(8).
It allow configuring LUNs, issuing commands,
injecting errors and various other control
functions.
usr.bin/Makefile: Add ctlstat.
ctlstat/Makefile
ctlstat/ctlstat.8,
ctlstat/ctlstat.c: ctlstat(8) fills a role similar to iostat(8).
It reports I/O statistics for CTL.
sys/conf/files: Add CTL files.
sys/conf/NOTES: Add device ctl.
sys/cam/scsi_all.h: To conform to more recent specs, the inquiry CDB
length field is now 2 bytes long.
Add several mode page definitions for CTL.
sys/cam/scsi_all.c: Handle the new 2 byte inquiry length.
sys/dev/ciss/ciss.c,
sys/dev/ata/atapi-cam.c,
sys/cam/scsi/scsi_targ_bh.c,
scsi_target/scsi_cmds.c,
mlxcontrol/interface.c: Update for 2 byte inquiry length field.
scsi_da.h: Add versions of the format and rigid disk pages
that are in a more reasonable format for CTL.
amd64/conf/GENERIC,
i386/conf/GENERIC,
ia64/conf/GENERIC,
sparc64/conf/GENERIC: Add device ctl.
i386/conf/PAE: The CTL frontend SIM at least does not compile
cleanly on PAE.
Sponsored by: Copan Systems, SGI and Spectra Logic
MFC after: 1 month
This uses the emuxkireg.h already used in the emu10k1
snd driver. Special thanks go to Alexander Motin as
he was able to find some errors and reverse engineer
some wrong values in the emuxkireg header.
The emu10kx driver is now free from the GPL.
PR: 153901
Tested by: mav, joel
Approved by: jhb (mentor)
MFC after: 2 weeks
- Define schednetisr() to swi_sched.
- In the swi handler check if there is some data prepared,
and if true, then call pfsync_sendout(), however tell it
not to schedule swi again.
- Since now we don't obtain the pfsync lock in the swi handler,
don't use ifqueue mutex to synchronize queue access.
This introduces:
* a basic wtap interface
* a HAL, which implements an abstraction layer for implementing
different device behavious;
* A visibility plugin, which allows for control over which nodes
see other nodes (useful for mesh work.)
It doesn't yet implement sta/adhoc/hostap modes but these are quite
feasible to implement.
Monthadar uses it to do 802.11s mesh verification.
The userland tools will be committed in a follow-up commit.
Submitted by: Monthadar Al Jaberi <monthadar@gmail.com>
Add the ability for /dev/null and /dev/zero to accept
being set into non blocking mode via fcntl(). This
brings the code into compliance with IEEE Std 1003.1-2001
as referenced in another PR, 94729.
Reviewed by: jhb
MFC after: 1 week
revision 1.128
date: 2009/08/16 13:01:57; author: jsg; state: Exp; lines: +1 -5
remove prototypes of a bunch of functions that had their implementations
removed in pfsync v5.
would go negative after using the "-z" option to zero out the stats.
This patch fixes that by not zeroing out the srvcache_size field
for "-z", since it is the size of the cache and not a counter.
MFC after: 2 weeks
the memory allocator used by netmap. No functional change,
two small bug fixes:
- in if_re.c add a missing bus_dmamap_sync()
- in netmap.c comment out a spurious free() in an error handling block
u_int. With the auto-sized buffer cache on the modern machines, UFS
metadata can generate more the 65535 pages belonging to the buffers
undergoing i/o, overflowing the counter.
Reported and tested by: jimharris
Reviewed by: alc
MFC after: 1 week
bpf_iflist list which reference the specified ifnet. The existing implementation
only removes the first matching bpf_if found in the list, effectively leaking
list entries if an ifnet has been bpfattach()ed multiple times with different
DLTs.
Fix the leak by performing the detach logic in a loop, stopping when all bpf_if
structs referencing the specified ifnet have been detached and removed from the
bpf_iflist list.
Whilst here, also:
- Remove the unnecessary "bp->bif_ifp == NULL" check, as a bpf_if should never
exist in the list with a NULL ifnet pointer.
- Except when INVARIANTS is in the kernel config, silently ignore the case where
no bpf_if referencing the specified ifnet is found, as it is harmless and does
not require user attention.
Reviewed by: csjp
MFC after: 1 week
from the interactive loader(8) prompt and beastie_disable="YES" is set
in loader.conf(5). In this case menu.rc is not evaluated and consequently
menu-unset does not have a body yet. This results in the ficl warning
"menu-unset not found" when try-menu-unset invokes menu-unset.
Check for beastie_disable="YES" explicitly, so that the try-menu-unset
word will not attempt to invoke menu-unset because the menu will have
never been configured. [1]
Use the sfind primitive as a last resort as an additional safer approach
conjuring a foreign word safely. [2]
PR: kern/163938
Submitted by: Devin Teske [1]
Reviewed by: Devin Teske [2]
Reported and tested by: dim
MFC after: 1 week
X-MFC with: r228985
o Make the pfsync.ko actually usable. Before this change loading it
didn't register protosw, so was a nop. However, a module /boot/kernel
did confused users.
o Rewrite the way we are joining multicast group:
- Move multicast initialization/destruction to separate functions.
- Don't allocate memory if we aren't going to join a multicast group.
- Use modern API for joining/leaving multicast group.
- Now the utterly wrong pfsync_ifdetach() isn't needed.
o Move module initialization from SYSINIT(9) to moduledata_t method.
o Refuse to unload module, unless asked forcibly.
o Improve a bit some FreeBSD porting code:
- Use separate malloc type.
- Simplify swi sheduling.
This change is probably wrong from VIMAGE viewpoint, however pfsync
wasn't VIMAGE-correct before this change, too.
Glanced at by: bz
destroyed prior to pfsync_uninit(). To do this, move all the
initialization to the module_t method, instead of SYSINIT(9).
o Fix another panic after module unload, due to not clearing the
m_addr_chg_pf_p pointer.
o Refuse to unload module, unless being unloaded forcibly.
o Revert the sub argument to MODULE_DECLARE, to the stable/8 value.
This change probably isn't correct from viewpoint of VIMAGE, but
the module wasn't VIMAGE-correct before the change, as well.
Glanced at by: bz
The vfs_busy() is after covered vnode lock in the global lock order, but
since quotaon() does recursive VFS call to open quota file, we usually
end up locking covered vnode after mp is busied in sys_quotactl().
Change the interface of VFS_QUOTACTL(), requiring that mp was unbusied
by fs code, and do not try to pick up vfs_busy() reference in ufs quotaon,
esp. if vfs_busy cannot succeed due to unmount being performed.
Reported and tested by: pho
MFC after: 1 week
operation on POSIX shared memory objects and tmpfs. Previously, neither of
these modules correctly handled the case in which the new size of the object
or file was not a multiple of the page size. Specifically, they did not
handle partial page truncation of data stored on swap. As a result, stale
data might later be returned to an application.
Interestingly, a data inconsistency was less likely to occur under tmpfs
than POSIX shared memory objects. The reason being that a different mistake
by the tmpfs truncation operation helped avoid a data inconsistency. If the
data was still resident in memory in a PG_CACHED page, then the tmpfs
truncation operation would reactivate that page, zero the truncated portion,
and leave the page pinned in memory. More precisely, the benevolent error
was that the truncation operation didn't add the reactivated page to any of
the paging queues, effectively pinning the page. This page would remain
pinned until the file was destroyed or the page was read or written. With
this change, the page is now added to the inactive queue.
Discussed with: jhb
Reviewed by: kib (an earlier version)
MFC after: 3 weeks
in the ARP datagram generated by arprequest(). If caller doesn't
supply the address, then it is either picked from CARP or hardware
address of the interface is taken.
While here, make several minor fixes:
- Hold IF_ADDR_RLOCK(ifp) while traversing address list.
- Remove not true comment.
- Access internet address and mask via in_ifaddr fields,
rather than ifaddr.
If set to 1, no ABORT is sent back in response to an incoming
INIT. If set to 2, no ABORT is sent back in response to
an out of the blue packet. If set to 0 (the default), ABORTs
are sent.
Discussed with rrs@.
MFC after: 1 month.
- Use Elf32_Addr as default, the only field that is
64 bitw wide is R_MIPS_64
- Add R_MIPS_HIGHER and R_MIPS_HGHEST handlers
- Handle R_MIPS_HI16 and R_MIPS_LO16 for both .rel and
.rela sections
The effect of this was, for clients mounted via inet6 addresses,
that the DRC cache would never have a hit in the server. It also
broke NFSv4 callbacks when an inet6 address was the only one available
in the client. This patch fixes the above, plus deletes opt_inet6.h
from a couple of files it is not needed for.
MFC after: 2 weeks
seem to be used elsewhere.
Since UFS_ACL is enabled by default for GENERIC kernels, this shouldn't
break anything - but please beat me to fix things if it does.
This reduces the footprint of the kernel on small embedded systems
(think <1MB flash for the compressed kernel image) just enough to
actually fit.
where they've disabled all the wireless devices/framework.
This is just a build workaround. If you're actively using wireless,
you must still define AH_SUPPORT_AR5416 as I'm not sure what else
will break!
The real solution is to make the module build depend if AH_SUPPORT_AR5416
is defined, as well as make the 11n code in if_ath_tx.c and if_ath_tx_ht.c
completely optional (maybe depend upon ATH_SUPPORT_11N.)
revision 1.170
date: 2011/10/30 23:04:38; author: mikeb; state: Exp; lines: +6 -7
Allow setting big MTU values on the pfsync interface but not larger
than the syncdev MTU. Prompted by the discussion with and tested
by Maxim Bourmistrov; ok dlg, mpf
Consistently use sc_ifp->if_mtu in the MTU check throughout the
module. This backs out r228813.
This was preventing the ath driver from being loaded at runtime.
It worked fine when compiled statically into the kernel but not when
kldload'ed after the system booted.
The root cause was that PCIR_INTLINE (register 60) was being
overwritten by zeros when register 62 was being written to.
A subsequent read of this register would return 0, and thus
the rest of the PCI glue assumed an IRQ resource had already
been allocated. This caused the device to fail to attach at
runtime as the device itself didn't contain any IRQ resources.
TODO: go back over the ar71xx and ar724x PCI config read/write
code and ensure it's correct.
value used in sys/ofed/include/linux/netdevice.h), so there will be no
buffer overruns in the rest of the inline functions in this file.
Reviewed by: kmacy
MFC after: 1 week