195702, 195703, and 195821 prevented a thread from suspending while holding
locks inside of NFS by forcing the thread to fail sleeps with EINTR or
ERESTART but defer the thread suspension to the user boundary. However,
this had the effect that stopping a process during an NFS request could
abort the request and trigger EINTR errors that were visible to userland
processes (previously the thread would have suspended and completed the
request once it was resumed).
This change instead effectively masks stop signals while in the NFS client.
It uses the existing TDF_SBDRY flag to effect this since SIGSTOP cannot
be masked directly. Also, instead of setting PBDRY on individual sleeps,
the NFS client now sets the TDF_SBDRY flag around each NFS request and
stop signals are masked for all sleeps during that region (the previous
change missed sleeps in lockmgr locks). The end result is that stop
signals sent to threads performing an NFS request are completely
ignored until after the NFS request has finished processing and the
thread prepares to return to userland. This restores the behavior of
stop signals being transparent to userland processes while still
preventing threads from suspending while holding NFS locks.
Reviewed by: kib
MFC after: 1 month
using /dev/consolectl close. This fixes a problem where if
a USB mouse is detached while a button is pressed, that
button is never released.
MFC after: 1 week
- Add const quilifiers to fields that store value of __FILE__.
- Use long type for fields that store value of __LINE__.
- Sort and style(9) debugging fields.
- Add initializer for debugging fields into TAILQ_INITIALIZER macro.
PR: 175759
Submitted by: Andrey Simonenko <simon comsys.ntu-kpi.kiev.ua>
Reviewed by: bde
This eliminates the need to recompile the kernel when the default value
of NKPT is not big enough - for e.g. when loading large kernel modules
or memory disk images from the loader.
If NKPT is defined in the kernel configuration file then it overrides the
dynamic calculation.
Reviewed by: alc, kib
requires 8 bytes alignment on RX buffer. Given that non-jumbo
frame works on any alignments I guess this DMA limitation for RX
buffer could be jumbo frame specific one. Also I'm not sure
whether this DMA limitation is related with 64bit DMA. Previously
age(4) disabled 64bit DMA addressing due to silent data corruption.
So we may need more testing on re-enabling 64bit DMA in future.
While I'm here, change mbuf chaining algorithm to use fixed sized
buffer and force software checksum if controller reports length
error. According to QAC, RFD is not updated at all for jumbo frame
so it works just like alc(4) controllers. This change also added
alignment fixup for strict alignment architectures. Because I'm
not aware of any non-x86 machines that use age(4) controllers it's
just for completeness at this moment.
Wit this change, jumbo frame should work with age(4).
Tested by: Christian Gusenbauer < c47g <> gmx dot at >
MFC after: 1 week
The check is copied from vnet_ng_ether_init.
Not sure if it covers all the types that we want to support with
ng_ether.
Reported by: markj
Discussed with: zec
MFC after: 10 days
X-MFC with: r246245
Now they are split into two pairs: page_hold/page_unhold for mappedread
and page_busy/page_unbusy for update_pages.
For mappedread we simply hold a page that is to be used as a source if it
is resident and valid (and not busy). This is sufficient since we are
only doing page -> user buffer copying. There is no page <-> backing
storage I/O involved.
update_pages is now better split to properly handle the putpages case
(page -> arc) and the regular write case (arc -> page).
For the latter we use complete protocol of marking an object with
paging-in-progress and marking a page with io_start (busy count).
Also, in this case we remove the write bit from all page mappings and
clear dirty bits of the pages, the former is needed to ensure that the
latter does the right thing.
Additionally we update a page if it is cached instead of just freeing it
as was done before. This needs to be verified.
A minor detail: ZFS-backed pages should always be either fully valid
or fully invalid. Assert this and use simpler API that does not deal
with sub-page blocks.
Reviewed by: kib
MFC after: 26 days
has gone below zero after the blocks in its inode are freed is a
no-op which the compiler fails to warn about because of the use of
the DIP macro. Change the sanity check to compare the number of
blocks being freed against the value i_blocks. If the number of
blocks being freed exceeds i_blocks, just set i_blocks to zero.
Reported by: Pedro Giffuni (pfg@)
MFC after: 2 weeks
Only during very early boot, before malloc(9) is functional (SI_SUB_KMEM),
the static ktr_buf_init is used. Size of the static buffer is determined
by a new kernel option KTR_BOOT_ENTRIES. Its default value is 1024.
This commit builds on top of r243046.
Reviewed by: alc
MFC after: 17 days
Major changes:
* Finally tracked down the flow control setting that
seems to have been causing TX stalls and watchdog timeouts
* RX and TX paths now share a lot more code
* TX interrupt is no longer used; we instead GC finished
tx queue entries at the bottom of the start routine.
* TX start now queues fragmented packets directly; it only
invokes defrag() for occasional very fragmented packets.
* "sysctl dev.cpsw" dumps controller statistics and queue counts
* Host Error Interrupt will give extensive debugging information
if the controller chokes on the queued data.
- Remove unused extern declarations in fs.h
- Correct comments in ext2_dir.h
- Several panic() messages showed wrong function names.
- Remove commented out stray line in ext2_alloc.c.
- Remove the unused macro EXT2_BLOCK_SIZE_BITS() and the then
write-only member e2fs_blocksize_bits from struct m_ext2fs.
- Remove the unused macro EXT2_FIRST_INO() and the then write-only
member e2fs_first_inode from struct m_ext2fs.
- Remove EXT2_DESC_PER_BLOCK() and the member e2fs_descpb from
struct m_ext2fs.
- Remove the unused members e2fs_bmask, e2fs_dbpg and
e2fs_mount_opt from struct m_ext2fs
- Correct harmless off-by-one error for fspath in ext2_vfsops.c.
- Remove the unused and broken macros EXT2_ADDR_PER_BLOCK_BITS()
and EXT2_DESC_PER_BLOCK_BITS().
- Remove the !_KERNEL versions of the EXT2_* macros.
Submitted by: Christoph Mallon
MFC after: 2 weeks
passed in by smartd of smartmontools.
While at it, hint the compiler that 32-bit PIO is the most likely
case (idea from Linux) and use bus_{read,write}_stream_2(9) instead
of bus_{read,write}_multi_stream_2(9) for single count reads/writes.
MFC after: 1 week
This hack is picked up from Linux, which claims that it follows
Windows behavior.
PR: amd64/174409
Tested by: Sergey V. Dyatko <sergey.dyatko@gmail.com>,
KAHO Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>,
Slawa Olhovchenkov <slw@zxy.spb.ru>
MFC after: 13 days
- change 'pics' from STAILQ to TAILQ
- ensure that Local APIC is always first in 'pics'
Reviewed by: jhb
Tested by: Sergey V. Dyatko <sergey.dyatko@gmail.com>,
KAHO Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
MFC after: 12 days
And provide kernel compiler version as a sysctl as well.
This is useful while we have gcc and clang cohabitation.
This could be even more useful when we have support
for external toolchains.
In cooperation with: mjg
MFC after: 13 days
Also sanitize interface names that can potentially contain characters
that are prohibited in netgraph names.
PR: kern/154850 (sanitizing of names)
Discussed with: eri, melifaro
Submitted by: Nikolay Denev <ndenev@gmail.com> (sanitizing code)
Reviewed by: eri, glebius
MFC after: 17 days
- there is no such flag in Solaris and derivatives
- the flag was added in an unrelated change
- the flag is not used
The proper way to allocate zeroed out memory is to use kmem_zalloc.
MFC after: 3 days
x86 buses
Otherwise the uart hardware could be in such a state after the resume
where IER is cleared and thus no interrupts are generated.
This behavior is observed and tested with QEMU, so I am comitting this
change to help with my debugging.
There has been no feedback from users of serial ports on real hardware.
MFC after: 20 days
This should allow the kernel linker to easily detect a situation
when the module is present both in a kernel and in a preloaded file
(zfs.ko).
Reviewed by: jhb
MFC after: 5 days
track the MNT_SYNCHRONOUS flag. It is set to the latter at mount time
but not updated by MNT_UPDATE.
Use MNT_SYNCHRONOUS to decide to write the FAT updates syncrhonously.
Submitted by: bde
MFC after: 1 week
a directory to a subdir of the root directory from somewhere else.
For all directory moves that change the parent directory, the dotdot
entry must be fixed up. For msdosfs, the root directory is magic for
non-FAT32. It is less magic for FAT32, but needs the same magic for
the dotdot fixup. It didn't have it.
Both chkdsk and fsck_msdosfs fix the corrupt directory entries with no
problems.
The fix is to use the same magic for dotdot in msdosfs_rename() as in
msdosfs_mkdir().
For msdosfs_mkdir(), document the magic. When writing the dotdot entry
in mkdir, use explicitly set pcl variable instead on relying on the
start cluster of the root directory typically has a value < 65536.
Submitted by: bde
MFC after: 1 week
Trying FAT32 on a small partition failed to mount because
pmp->pm_Sectors was nonzero. Normally, FAT32 file systems are so
large that the 16-bit pm_Sectors can't hold the size. This is
indicated by setting it to 0 and using only pm_HugeSectors. But at
least old versions of newfs_msdos use the 16-bit field if possible,
and msdosfs supports this except for breaking its own support in the
sanity check. This is quite different from the handling of pm_FATsecs
-- now the 16-bit value is always ignored for FAT32 except for
checking that it is 0, and newfs_msdos doesn't use the 16-bit value
for FAT32.
Submitted by: bde
MFC after: 1 week
this check is somewhere in the network code, but this assertion
already proven to be useful in catching what seems to be driver bugs
causing NFS scrambling random memory.
Discussed with: rmacklem
MFC after: 1 week
zero on slower machines, which make the fenced get_timecount methods
not used despite needed. Remove the (shift > 0) condition when
selecting the get_timecount() implementation.
Rename smp_tsc_shift to tsc_shift, and apply it for the UP case too.
Allow shift to reach value of 31 instead of 30, as it was previously
(should be nop).
Reorganize the tc quality calculation to remove the conditionally
compiled block. Rename test_smp_tsc() to test_tsc() and provide
separate versions for SMP and UP builds. The check for virtialized
hardware is more natural to perform in the smp version of the
test_tsc(), since it is only done for smp case.
Noted and reviewed by: bde (previous version)
MFC after: 12 days
VM_KMEM_SIZE_SCALE specifies which fraction of the available physical
memory, after deduction of the kernel itself and other early statically
allocated memory, can be used for the kmem_map. The kmem_map provides
for all UMA/malloc allocations in KVM space.
Previously ARM was using a fixed kmem_map size of (12*1024*1024) = 12MB
without regard to effectively available memory. This is too small for
recent ARM SoC with more than 128MB of RAM.
For reference a description of others related kmem_map parameters:
VM_KMEM_SIZE default start size of kmem_map if SCALE is
not defined
VM_KMEM_SIZE_MIN hard floor on the kmem_map size
VM_KMEM_SIZE_MAX hard ceiling on the kmem_map size
VM_KMEM_SIZE_SCALE fraction of the available real memory to
be used for the kmem_map, limited by the
MIN and MAX parameters.
Tested by: ian
MFC after: 1 week
The "blackhole" driver was used in conjunction with bhyve to sequester
pci devices intended for passthru until vmm.ko was loaded. This was
useful at one point because vmm.ko could not be loaded at boot time.
The same functionality can now be achieved by loading vmm.ko via the
loader along with the kernel.
Discussed with: grehan
Obtained from: NetApp
can only be located at the beginning or the end of the BAR.
If the MSI-table is located in the middle of a BAR then we will split the
BAR into two and create two mappings - one before the table and one after
the table - leaving a hole in place of the table so accesses to it can be
trapped and emulated.
Obtained from: NetApp
The maximum length of an environment variable puts a limitation on the
number of passthru devices that can be specified via a single variable.
The workaround is to allow user to specify passthru devices via multiple
environment variables instead of a single one.
Obtained from: NetApp
case 0x3E: /* Per Intel document 325462-045US 01/2013. */
Add manpage to document all the goodness that is available in this
processor model.
No support for uncore events at this time.
Submitted by: hiren panchasara <hiren.panchasara@gmail.com>
Reviewed by: davide, jimharris, sbruno
Obtained from: Yahoo! Inc.
MFC after: 2 weeks
Since ARP and routing are separated, "proxy only" entries
don't have any meaning, thus we don't need additional field
in sockaddr to pass SIN_PROXY flag.
New kernel is binary compatible with old tools, since sizes
of sockaddr_inarp and sockaddr_in match, and sa_family are
filled with same value.
The structure declaration is left for compatibility with
third party software, but in tree code no longer use it.
Reviewed by: ru, andre, net@
Right now, ic_curchan seems to be updated rather quickly (ie, during
the ioctl) and before the driver gets notified of what's going on.
So what I was seeing was:
* NIC was in channel X;
* It generates PHY errors for channel X;
* an ioctl comes along from userland and changes things to channel Y;
* .. this updates ic_curchan, but hasn't yet reset the hardware;
* in parallel, RX is occuring and it looks at ic_curchan;
* .. which is channel Y, so events get stamped with that now.
Sigh.
into the FreeBSD boot loader, typically for non-USB aware BIOSes, EFI systems
or embedded platforms. This is also useful for out of the system compilation
of the FreeBSD USB stack for various purposes. The USB kernel files can
now optionally include a global header file which should include all needed
definitions required to compile the FreeBSD USB stack. When the global USB
header file is included, no other USB header files will be included by
default.
Add new file containing the USB stack configuration for the
FreeBSD loader build.
Replace some __FBSDID()'s by /* $FreeBSD$ */ comments. Now all
USB files follow the same style.
Use cases:
- console in loader via USB
- loading kernel via USB
Discussed with: Hiroki Sato, hrs @ EuroBSDCon
in kern_wait6(), which is called by kern_wait(). Remove the redundand
check, introduced in r243136, and add a comment noting this, to make
the code less confusing.
The blank lines are added to properly delineate the scope of the
preceeding comments.
Noted by: "Jukka A. Ukkonen" <jau@iki.fi>
MFC after: 1 week
but use normal references instead of weak. This makes the statically
linked binaries to use fast gettimeofday(2) by forcing the linker to
resolve references and providing the neccessary functions.
Reported by: bde
Tested by: marius (sparc64)
MFC after: 2 weeks
timecounter to 1, and correspondingly increase the precision of the
gettimeofday(2) and related functions in the default configuration.
The motivation for the TSC-low timecounter, as described in the
r222866, seems to provide a workaround for the non-serializing
behaviour of the RDTSC on some Intel hardware. Tests demonstrate that
even with the pre-shift of 8, the cross-core non-monotonicity of the
RDTSC is still observed reliably, e.g. on the Nehalems. The r238755
and r238973 implemented the proper fix for the issue.
The pre-shift of 1 is applied to keep TSC not overflowing for the
frequency of hardclock down to 2 sec/intr. The pre-shift is made a
tunable to allow the easy debugging of the issues users could see with
the shift being too low.
Reviewed by: bde
MFC after: 2 weeks
bug in old versions of QEMU (and Xen, and other places using QEMU code).
On those buggy emulated UARTs, the "TX idle" interrupt gets lost; with
this workaround, we spinwait for the TX to happen and then send ourselves
the interrupt. It's ugly but it works, while minimizing the impact on
the code for the !broken_txfifo case.
MFC after: 2 weeks
In all the routines that loop through a range of virtual addresses, the loop
is controlled by subtracting the cache line size from the total length of the
request. After the subtract, a 'bpl' instruction was used, which branches if
the result of the subtraction is zero or greater, but we need to exit the
loop when the count hits zero. Thus, all the bpl instructions in those loops
have been changed to 'bhi' (branch if greater than zero).
In addition, the two routines that walk through the cache using set-and-index
were correct, but confusing. The loop control for those has been simplified,
just so that it's easier to see by examination that the code is correct.
Routines for other arm architectures and generations still have the bpl
instruction, but compensate for the off-by-one situation by decrementing
the count register by one before entering the loop.
PR: arm/174461
Approved by: cognet (mentor)
If a BUSDMA load operation results in a single segment which
is greater than the PAGE_SIZE, the USB computed physical
addresses will not be correct. Make sure that the first
segment is unfolded like the sub-sequent segments are into
USB_PAGE_SIZE big ranges.
Found by: Alexander Nedotsukov
MFC after: 1 week
requested from the server for the read operation. Server shall not
reply with too large size, but client should be resilent too.
Reviewed by: rmacklem
MFC after: 1 week
flush wait on the Gen2 chipsets. Confirmed by the inspection of the
Linux agp code.
Submitted by: Taku YAMAMOTO <taku@tackymt.homeip.net>
MFC after: 2 weeks
This adds support for version 10, revision 01, but it should also work
without changes for the 0901 model, at least until we get drivers for the
two different wifi chips involved.
Many users contributed to and tested the various patchsets floating around
for the past year that have eventually evolved into this checkin, most notably
Richard Neese who provided the bulk of the kernel config file.
Approved by: cognet (mentor)
so that we don't need an empty implementation of it for every Marvell platform
that has no PCI. This allows the removal of the SheevaPlug-specific stub and
config files, and eliminates the need to add similar stubs for future models.
Marvell platforms that do expose PCI are compiled with 'device pci' which
causes the real (non-weak) implementation in dev/fdt/fdt_pci.c to be used.
Approved by: cognet (mentor)
the prior commit. Use essentially the same sprintf() statement for both
formatting and pre-formatting, and use a format string which eliminates the
need for an extra temporary buffer when formatting the name.
Noted by: Christoph Mallon
Pointy hat to: ian
Approved by: cognet (mentor)
It seems that old ZFS versions (v15) completely omit "vdev_children"
property when there is a single child.
Reported by: jase
Tested by: jase
MFC after: 1 week
cannot be freed while do_pass_accept_req is running. This closes a race
where do_pass_establish on another CPU (the driver chose a different
queue for the new tid) expands the synq entry into a full PCB and then
releases the only hold on it, all while do_pass_accept_req is still
running.
MFC after: 3 days
* Add HTINFO field decoding to ieee80211_ies_expand() - it's likely not
100% correct as it's not looking at the draft 11n HTINFO location,
but I don't think anyone will care.
* When doing an IBSS join make sure the 11n channel configuration
is used - otherwise the 11a/11bg channel will be used
and there won't be any chance for an upgrade to 11n.
* When creating an IBSS network, ensure the channel is updated to an
11n channel so other 11n nodes can see it and speak to it with MCS
rates.
* Add a bit of code that's disabled for now which handles the HT
field updating. This won't work out very well with lots of adhoc
nodes as we'd end up ping-ponging between the HT configuration for
each node. Instead, we should likely only pay attention to the
"master" node we initially associated against and then ensure we
propagate that information forward in our subsequent beacons. However,
due to the nature of IBSS (ie, there's no specific "master" node in
the specification) it's unclear which node we should lift the HT
parameters from.
So for now this assumes the HT parameters are squirreled away in the
initial beacon/probe response.
So there's some trickiness here.
With ap/sta pairing, the probe response just populates a legacy node
and the association request/response is what is used for negotiation
11n-ness (and upgrading things as needed.)
With ibss networks, the pairing is done with probe request/response,
with discovery being done by creating nodes when new beacons in the
IBSS / BSSID are heard. There's no assoc request/response frames going on.
So the trick here has been to figure out where to upgrade things.
I don't like how I just taught ieee80211_sta_join() to "speak" HT -
I'd rather there be an upgrade path when an IBSS node joins and there
are HT parameters present. Once I've done that, I'll kill this
HT special casing that's going on in ieee80211_sta_join().
Tested:
* AR9280, AR5416, AR5212 - basic iperf and ping interoperability tests
whilst in a non-encrypted adhoc network.
TODO:
* Fix up the HT upgrade path for IBSS nodes rather than adding code
in ieee80211_sta_join(), then remove my code from there.
* When associating, there's a concept of a "master" node in the IBSS
which is the node you first joined the network through. It's possible
the correct thing to do is to listen to HT updates and configure WME
parameters from that node. However, once that node goes away, which
node(s) should be listened to for configuration changes?
For things like HT channel width, it's likely going to be ok to
just associate as HT40 and then use the per-neighbor rate control
and HTINFO/HTCAP fields to figure out which rates and configuration
to speak. Ie, for a 20MHz 11n node, just speak 20MHz rates to
it. It shouldn't "change", like what goes on in AP/STA configurations.
the separate ath0 TX taskq.
Whilst here, make sure that the TX software scheduler is also
running out of the TX task, rather than the ath0 taskqueue.
Make sure that the tx taskqueue is blocked/unblocked as necessary.
This allows for a little more parallelism on multi-core machines,
as well as (eventually) supporting a higher task priority for TX
tasks, allowing said TX task to preempt an already running RX or
TX completion task.
Tested:
* AR5416, AR9280 hostap and STA modes
- Make bge_lookup_{rev,vendor}() static.
- Factor out chip identification rather than duplicating the code.
- Sanitize bge_probe() a bit (don't hardcode buffer sizes, allow
bge_lookup_vendor() to return NULL so the excessive panic() three
can be removed there, etc.) and return BUS_PROBE_DEFAULT rather than
hardcoding 0.
- According to the Linux tg3 driver, BCM57791 and BCM57795 aren't
capable of Gigabit Ethernet.
- Check the return value of taskqueue_start_threads().
lle_event replaced arp_update_event after the ARP rewrite and ended up
in if_ether.h simply because arp_update_event used to be there too.
IPv6 neighbor discovery is going to grow lle_event support and this is a
good time to move it to if_llatbl.h.
The two in-tree consumers of this event - OFED and toecore - are not
affected.
Reviewed by: bz@
- At least the Saturn chips of 501-6738 cards need a delay after freezing
the external GMII pins before the internal PHY is accessible again. So
wait a bit after (un)freezing these. Also don't touch the other bits of
that configuration register. [1]
- Take advantage of nitems().
Reported and tested by: Paul Keusemann [1]
MFC after: 3 days
- Use NFSD_MONOSEC (which maps to time_uptime) instead of the seconds
portion of wall-time stamps to manage timeouts on events.
- Remove unused nd_starttime from the per-request structure in the new
NFS server.
- Use nanotime() for the modification time on a delegation to get as
precise a time as possible.
- Use time_second instead of extracting the second from a call to
getmicrotime().
Submitted by: bde (3)
Reviewed by: bde, rmacklem
MFC after: 2 weeks
in the man page and its header counterpart.
Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (initial version)
Reviewed and further improved by: bde (previous version)
All bugs are: mine
The changes are:
- the microcore code loaded into the NAE has to be byteswapped
in LE
- the descriptors in memory for a P2P NAE descriptor has to be
byteswapped in LE
- the m_data pointer is already cacheline aligned, so the
unnecessary m_adj to cacheline size can be removed
- fix mask used to obtain physical address from the Tx freeback
descriptor
- fix a compile error in code under #ifdef
Obtained from: Venkatesh J V <venkatesh.vivekanandan@broadcom.com>
The CMS output queue credit configuration register is 64 bit, so use
a 64 bit variable while updating it.
Obtained from: Venkatesh J V <venkatesh.vivekanandan@broadcom.com>
Update MDIO reset code to support Broadcom XLP B1 revisions.
Update nlm_xlpge_ioctl, nlm_xlpge_port_enable need not be
called after nlm_xlpge_init.
Obtained from: Venkatesh J V <venkatesh.vivekanandan@broadcom.com>
Support few more versions of board firmware. In case the security
block is disabled, enable it at boot. Also increase the excluded
memory region to cover the area used by the firmware to initialize
devices.
Update the function xlp_pcib_hardware_swap_enable() to do nothing
when BYTE_ORDER is not BIG_ENDIAN. PCIe hardware swap is not requred
in little-endian mode as the endianness matches that of CPU.
reading registers from other CPUs. As it turns out, the hardware doesn't
really like concurrent IPI'ing causing adverse effects. Also the thought
deadlock when using this spin lock here and the targeted CPU(s) are also
holding or in case of nested locks can't actually happen. This is due to
the fact that on sparc64, spinlock_enter() only raises the PIL but doesn't
disable interrupts completely. Thus direct cross calls as used for the
register reading (and all other MD IPI needs) still will be executed by
the targeted CPU(s) in that case.
MFC after: 3 days
FreeBSD TCP-level socket options (only the first two are). Instead,
using a mapping function and fail unsupported options as we do for other
socket option levels.
MFC after: 2 weeks
comconsole setup. Previously the hint would be set when if you set a
custom port, but it would not be updated if you later set a custom speed.
Also, leave the hw.uart.console hint mutable so it can be overridden or
unset by the user if needed.
Reviewed by: kib (earlier version)
MFC after: 1 week
The previous change accidentally left the substraction we
were trying to avoid in case that i_blocks could become
negative.
Reported by: bde
MFC after: 4 days
By setting dev.netmap.fwd=1 (or enabling the feature with a per-ring flag),
packets are forwarded between the NIC and the host stack unless the
netmap client clears the NS_FORWARD flag on the individual descriptors.
This feature greatly simplifies applications where some traffic
(think of ARP, control traffic, ssh sessions...) must be processed
by the host stack, whereas the bulk is handled by the netmap process
which simply (un)marks packets that should not be forwarded.
The default is chosen so that now a netmap receiver operates
in a mode very similar to bpf.
Of course there is no free lunch: traffic to/from the host stack
still operates at OS speed (or less, as there is one extra copy in
one direction).
HOWEVER, since traffic goes to the user process before being
reinjected, and reinjection occurs in a user context, you get some
form of livelock protection for free.
Ext2fs uses unsigned fields in its dinode struct.
FreeBSD can have negative values in some of those
fields and the inode is meant to interact with the
system so we have never respected the unsigned
nature of most of those fields.
Block numbers and the NFS generation number do
not need to be signed so redefine them as
unsigned to better match the on-disk information.
MFC after: 1 week
Add a missing 0 to the mask for byte0 of C_SIZE.
The previous mask (0xc) worked except that the last 0-1536K of the disk
could not be accessed since we were shifting the (wrong) bits we did
mask off the right edge.
Testing with fsx has revealed problems and in order to
hunt the bugs properly we need reduce the complexity.
This seems to help but is not a complete solution.
MFC after: 3 days
logic (refer to [1] for associated discussion). snd_cwnd and snd_wnd are
unsigned long and on 64 bit hosts, min() will truncate them to 32 bits and could
therefore potentially corrupt the result (although under normal operation,
neither variable should legitmately exceed 32 bits).
[1] http://lists.freebsd.org/pipermail/freebsd-net/2013-January/034297.html
Submitted by: jhb
MFC after: 1 week
generating binary diffs.
- Constify a few strings used in the driver.
- Style changes to make the driver compile with default clang settings.
Approved by: HighPoint Technologies
MFC after: 3 days
gcc handles -symbolic by passing -Bsymbolic through to ld. clang ignores
-symbolic and thus invokes ld without -Bsymbolic which leads to some symbols
not being properly linked in loader.efi. Fix this by using -Wl,-Bsymbolic which
passes -Bsymbolic to ld in both the gcc and clang cases.
Approved by: rpaulo
This is easily possible now that the TX is protected by a single
lock, rather than a per-TXQ (and thus per-TID) lock.
Only set CLRDMASK if none of the destinations are filtered.
This likely will need some tuning when it comes time to do UASPD/PS-POLL
TX, however at that point it should be manually set anyway.
Tested:
* AR9280, STA mode
TODO:
* More thorough testing in AP mode
* test other chipsets, just to be safe/sure.
that 'smp_started != 0'.
This is required because the VT-x initialization calls smp_rendezvous()
to set the CR4_VMXE bit on all the cpus.
With this change we can preload vmm.ko from the loader.
Reported by: alfred@, sbruno@
Obtained from: NetApp
arch_zfs_probe method is supposed to only probe for ZFS vdevs, but it can
not expect that ZFS data is in a ready state yet.
So, move some code from sparc64_zfs_probe to main to meet the constraints.
Reported by: Chris Ross <cross+freebsd@distal.com>
Tested by: Chris Ross <cross+freebsd@distal.com>
MFC after: 4 days
'bhyve' was developed by grehan@ and myself at NetApp (thanks!).
Special thanks to Peter Snyder, Joe Caradonna and Michael Dexter for their
support and encouragement.
Obtained from: NetApp
Make umass return an error code if SCSI sense retrieval request
has failed. Make sure scsi_error_action honors SF_NO_RETRY and
SF_NO_RECOVERY in all cases, even if it cannot parse sense bytes.
Reviewed by: hselasky (umass), scottl (cam)
interrupt counts and names, by making the names into an array of fixed-length
strings that can be directly indexed. This eliminates extra memory accesses
on every interrupt to increment the counts.
As a side effect, it also fixes a bug that would corrupt the names data
if a name was longer than MAXCOMLEN, which led to incorrect vmstat -i output.
Approved by: cognet (mentor)
to the old one's nfs.nfsrv.async.
Please note that by enabling this option (default is disabled), the system
could potentionally have silent data corruption if the server crashes
before write is committed to non-volatile storage, as the client side have
no way to tell if the data is already written.
Submitted by: rmacklem
MFC after: 2 weeks
two upcoming features:
semi-transparent mode:
when a device is opened in this mode, the
user program will be able to mark slots that must be forwarded
to the "other" side (i.e. from NIC to host stack, or viceversa),
and the forwarding will occur automatically at the next netmap syscall.
This saves the need to open another file descriptor and do
the forwarding manually.
direct-forwarding mode:
when operating with a VALE port, the user can specify in the slot
the actual destination port, overriding the forwarding decision
made by a lookup of the destination MAC. This can be useful to
implement packet dispatchers.
No API changes will be introduced.
No new functionality in this patch yet.
CPUs exhibit bad behavior if this is done (Intel Errata AAJ3, hangs on
Pentium-M, and trashing of the local APIC registers on a VIA C7). The
local APIC is implicitly mapped UC already via MTRRs, so the clflush isn't
necessary anyway.
MFC after: 2 weeks
tunable_mbinit() where it is next to where it is used later.
Change the sysinit level of tunable_mbinit() from SI_SUB_TUNABLES
to SI_SUB_KMEM after the VM is running. This allows to use better
methods to determine the effectively available physical and virtual
memory available to the kernel.
Update comments.
In a second step it can be merged into mbuf_init().
a process has an auditid/preselection masks specified, and
is jailed, include the zonename (jailname) token as a
part of the audit record.
Reviewed by: pjd
MFC after: 2 weeks
chip hangs.
* Always do a reset in ath_bmiss_proc(), regardless of whether the
hardware is "hung" or not. Specifically, for spectral scan, there's
likely a whole bunch of potential hangs that we don't (yet) recognise
in the HAL. So to avoid staying RX deaf persisting until the station
disassociates, just do a no-loss reset.
* Set sc_beacons=1 in STA mode. During a reset, the beacon programming
isn't done. (It's likely I need to set sc_syncbeacons during a hang
reset, but I digress.) Thus after a reset, there's no beacon timer
programming to send a BMISS interrupt if beacons aren't heard ..
thus if the AP disappears, you won't get notified and you'll have to
reset your interface.
This hasn't yet fixed all of the hangs that I've seen when debugging
spectral scan, but it's certainly reduced the hang frequency and it
should improve general STA stability in very noisy environments.
Tested:
* AR9280, STA mode, spectral scan off/on
PR: kern/175227
when an interface is going down.
Right now it's quite possible (but very unlikely!) that ath_reset()
or similar is called, leading to a beacon config call, in parallel with
the last VAP being destroyed.
This likely should be fixed by making sure the bmiss/bstuck/watchdog
taskqueues are canceled whenever the last VAP is destroyed.
in r245004. Although the report was for noatime option which is
non-functional for the nullfs, other standard options like nosuid or
noexec are useful with it.
Reported by: Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au>
MFC after: 3 days
by returning an error of EINTR rather than EACCES.
- While here, bring back some (but not all) of the NFS RPC statistics lost
when krpc was committed.
Reviewed by: rmacklem
MFC after: 1 week
serial devices (such as printer ports). This allows ppc devices attached
to puc to correctly setup an interrupt handler and work.
Tested by: Andre Albsmeier Andre.Albsmeier@siemens.com
MFC after: 1 week
When maxusers was unrestricted and maxfiles was allowed to autotune
much higher the result was that ncallout which was based on maxfiles
and maxproc grew much higher than was needed.
To fix this clip autotuning to the same number we would get with
the old maxusers algorithm which would stop scaling at 384
maxusers.
Growing ncalout higher is not likely to be needed since most consumers
of timeout(9) are gone and any higher value for ncallout causes the
callwheel hashes to be much larger than will even be needed for
most applications.
MFC after: 1 month
Reviewed by: mav