LLAs on the member interfaces are actually harmless when the parent
interface does not have a LLA.
- Add net.link.bridge.allow_llz_overlap. This is a knob to allow LLAs on
a bridge and the member interfaces at the same time. The default is 0.
Pointed out by: ume
MFC after: 3 days
must be destroyed, knlist_clear() and seldrain() calls could be
avoided, since vpollinfo was not used. Moreover, the knlist_clear()
calling protocol requires the knlist to be locked, which is not true
at the call site.
Split the destruction into the helper destroy_vpollinfo_free(), and
call it when raced, instead of destroy_vpollinfo().
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
the normal and the mesh transmit paths can use.
The API is a bit horrible because it both consumes the mbuf and frees
the node reference regardless of whether it succeeds or not.
It's a hold-over from how the code behaves; it'd be nice to have it
not free the node reference / mbuf if TX fails and let the caller
decide what to do.
While these operations are not really needed otherwise, at least for SCSI
they may cause extra errors if some other initiator holds write exclusive
reservation on the LUN (SYNCHRONIZE CACHE handled as "write" operation).
The original API calls for pow2ns, however the new APIs from
Linux call for seconds.
We need to be able to convert to/from 2^Nns to seconds in both
userland and kernel to fix this and properly compare units.
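
A minimal sketch of the unit conversion described above, assuming a
timeout is stored as an exponent N meaning 2^N nanoseconds. The helper
names pow2ns_to_sec() and sec_to_pow2ns() are hypothetical and not taken
from the tree; rounding up when converting seconds to an exponent is
also an assumption.

```c
#include <stdint.h>

#define NSEC_PER_SEC	UINT64_C(1000000000)

/* Hypothetical helper: 2^n nanoseconds, truncated to whole seconds. */
static uint64_t
pow2ns_to_sec(unsigned int n)
{
	if (n > 63)
		n = 63;		/* clamp so the shift stays defined */
	return ((UINT64_C(1) << n) / NSEC_PER_SEC);
}

/* Hypothetical helper: smallest n such that 2^n ns covers 'sec' seconds. */
static unsigned int
sec_to_pow2ns(uint64_t sec)
{
	uint64_t ns;
	unsigned int n;

	/* Saturate instead of overflowing the multiplication. */
	if (sec > UINT64_MAX / NSEC_PER_SEC)
		ns = UINT64_MAX;
	else
		ns = sec * NSEC_PER_SEC;
	for (n = 0; n < 63 && (UINT64_C(1) << n) < ns; n++)
		continue;
	return (n);
}
```

With these definitions a 1 second request maps to exponent 30, since
2^30 ns is the first power of two that is at least 10^9 ns.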
no longer have the parent in the device tree. This causes the identify
function in ipmi_isa.c to attempt to probe and poke at the ISA IPMI interface
Move the check for ipmi_attached out of the ipmi_isa_attach function and into
the ipmi_isa_identify function. Remove the check of the device tree for
ipmi devices attached.
This probing appears to make Broadcom management firmware on Dell machines
crash and emit NMI EISA warnings at various times requiring power cycles
of the machines to restore.
Bump MAX_TIMEOUT to 6 seconds as a hack for super slow IPMI interfaces that
need longer to respond to our initial probes on startup.
Tested on Dell R410, R510, R815, HP DL160G6
This is MFC candidate for 9.2R
Reviewed by: peter
MFC after: 2 weeks
Sponsored by: Yahoo! Inc.
- Don't short-circuit aging tests for unmapped objects. This biases
against unmapped file pages and transient mappings.
- Always honor PGA_REFERENCED. We can now use this after soft busying
to lazily restart the LRU.
- Don't transition directly from active to cached bypassing the inactive
queue. This frees recently used data much too early.
- Rename actcount to act_delta to be more consistent with use and meaning.
Reviewed by: kib, alc
Sponsored by: EMC / Isilon Storage Division
all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and
for general purpose monitoring without tapping any cxgbe or cxl ifnet
directly.
Tracers on the T4/T5 chips provide access to Ethernet frames exactly as
they were received from or transmitted on the wire. On transmit, a
tracer will capture a frame after TSO segmentation, hw VLAN tag
insertion, hw L3 & L4 checksum insertion, etc. It will also capture
frames generated by the TCP offload engine (TOE traffic is normally
invisible to the kernel). On receive, a tracer will capture a frame
before hw VLAN extraction, runt filtering, other badness filtering,
before the steering/drop/L2-rewrite filters or the TOE have had a go at
it, and of course before sw LRO in the driver.
There are 4 tracers on a chip. A tracer can trace only in one direction
(tx or rx). For now cxgbetool will set up tracers to capture the first
128B of every transmitted or received frame on a given port. This is a
small subset of what the hardware can do. A pseudo ifnet with the same
name as the nexus driver (t4nex0 or t5nex0) will be created for tracing.
The data delivered to this ifnet is an additional copy made inside the
chip. Normal delivery to cxgbe<n> or cxl<n> will be made as usual.
/* watch cxl0, which is the first port hanging off t5nex0. */
# cxgbetool t5nex0 tracer 0 tx0 (watch what cxl0 is transmitting)
# cxgbetool t5nex0 tracer 1 rx0 (watch what cxl0 is receiving)
# cxgbetool t5nex0 tracer list
# tcpdump -i t5nex0 <== all that cxl0 sees and puts on the wire
If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K
"frames" with no L3/L4 checksum but this will show you the frames that
were actually transmitted.
/* all done */
# cxgbetool t5nex0 tracer 0 disable
# cxgbetool t5nex0 tracer 1 disable
# cxgbetool t5nex0 tracer list
# ifconfig t5nex0 destroy
sysctl tree.
* Create a net.link.lagg.X.lacp node
* Add a debug node under that for tx_test and rx_test
* Add lacp_strict_mode, defaulting to 1
tx_test and rx_test are still a bitmap of unit numbers for now.
At some point it would be nice to create child nodes of the lagg bundle
for each sub-interface, and then populate those with various knobs
and statistics.
Sponsored by: Netflix
This eliminates some unusual uses of that API in favor of more typical
uses of kmem_malloc().
Discussed with: kib/alc
Tested by: pho
Sponsored by: EMC / Isilon Storage Division
Before this change path matching had the following features:
- for device nodes the patterns were matched against full path
- in the above case '/' in a path could be matched by a wildcard
- for directories and links only the last component was matched
So, for example, a pattern like 're*' could match the following entries:
- re0 device
- responder/u0 device
- zvol/recpool directory
Although it was possible to work around this behavior (once it was spotted
and understood), it was very confusing and contrary to documentation.
Now we always match a full path for all types of devfs entries (devices,
directories, links) and a '/' has to be matched explicitly.
This behavior follows the shell globbing rules.
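
The new rule can be illustrated in userland with fnmatch(3) and
FNM_PATHNAME, which also requires '/' to be matched explicitly. This is
only an analogy: the devfs rule code has its own matcher, and
path_matches() below is a hypothetical helper.

```c
#include <fnmatch.h>
#include <stdbool.h>

/*
 * Hypothetical helper: match a full path against a pattern where a
 * wildcard never matches '/', as in shell globbing.
 */
static bool
path_matches(const char *pattern, const char *path)
{
	return (fnmatch(pattern, path, FNM_PATHNAME) == 0);
}
```

Under this rule 're*' matches re0 but no longer matches responder/u0 or
zvol/recpool; the latter needs a pattern such as 'zvol/re*'.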
This change was originally developed by Jaakko Heinonen.
Many thanks!
PR: kern/122838
Submitted by: jh
MFC after: 4 weeks
and protocol 1 are USB ethernet adapters. This avoids keeping and updating
the product list every now and then. This patch will add support for the
USB ethernet interface found in the iPad.
MFC after: 1 week
we call device-specific probe functions, which can (and typically will)
set the device description based on low-level device probe information.
In the end we never actually used the device description that we so
carefully maintained in the PCI match table. By setting the device
description after we call uart_probe(), we'll print the more user-
friendly description by default.
- Only enable UDP/TCP hardware checksums if CSUM_UDP or CSUM_TCP is set.
- Only enable IP hardware checksums if CSUM_IP is set.
PR: kern/180430
Submitted by: Meny Yossefi <menyy@mellanox.com>
MFC after: 1 week
as in <sys/dirent.h>
ext2_readdir() has always been very fs specific and different
with respect to its ufs_ counterpart. Recent changes from UFS
have made it possible to share more closely the implementation.
MFUFS r252438:
Always start parsing at DIRBLKSIZ aligned offset, skip first entries if
uio_offset is not DIRBLKSIZ aligned. Return EINVAL if buffer is too
small for single entry.
Preallocate buffer for cookies.
Skip entries with zero inode number.
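
A minimal sketch of the offset handling described above, assuming the
directory block size is a power of two. The helper names are
hypothetical, and the block size is passed in as a parameter rather than
taken from any filesystem header.

```c
#include <stdint.h>

/* Hypothetical helper: round 'off' down to the start of its block. */
static uint64_t
dirblk_start(uint64_t off, uint64_t blksz)
{
	return (off & ~(blksz - 1));
}

/* Hypothetical helper: bytes of leading entries to skip in that block. */
static uint64_t
dirblk_skip(uint64_t off, uint64_t blksz)
{
	return (off & (blksz - 1));
}
```

Parsing starts at dirblk_start() and entries are discarded until
dirblk_skip() bytes have been consumed, so an unaligned uio_offset never
lands in the middle of an entry.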
Reviewed by: gleb, Zheng Liu
MFC after: 1 month
SVN r95378 refactored ahc_9005_subdevinfo_valid out into a separate
function but swapped the vendor/subvendor and device/subdevice pairs of
the parameters.
Found by: Coverity Prevent, CID 744931
Reviewed by: gibbs
This function is called 4 times in this file, with swapped parameter
ordering. Fix the function definition instead of all the call sites.
16bit/stereo or 8bit/mono playback is unaffected and was probably
working fine before, this should fix 16bit/mono and 8bit/stereo
playback.
Found by: Coverity Scan, CID 1006688
locks don't accidentally appear to have been already
initialized.
In particular, this fixes a consistent kernel crash on
armv6 with:
panic: lock "vm map (user)" 0xc09cc050 already initialized
that appeared with r251709.
PR: arm/180820
Kernel include files (i.e. sys/*.h) come first; normally, include
<sys/types.h> OR <sys/param.h>, but not both. <sys/types.h> includes
<sys/cdefs.h>, and it is okay to depend on that.
to those that are universally administered. While it is possible to
add locally administered MAC addresses, it's unclear whether those
are (expected) to be more unique than random multicast MAC addresses
or not.
With many U-Boot configurations assigning fixed and non-official MAC
addresses to ethernet ports and without setting the 'X' flag, this
change may have very little value in the embedded (development)
space. Uniqueness of the universally administered addresses is non-
existent on the (H/W) bench and questionable under the (S/W) desk.
In short: this change is aimed at production environments...
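
For reference, the distinction drawn above is encoded in the first octet
of a MAC address: bit 0 marks group (multicast) addresses and bit 1
marks locally administered ones. The sketch below uses hypothetical
helper names and does not claim to reproduce the filter in the UUID
generator.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAC_GROUP_BIT	0x01	/* I/G: set for multicast/broadcast */
#define MAC_LOCAL_BIT	0x02	/* U/L: set for locally administered */

/* Hypothetical helper: true if the address is universally administered. */
static bool
mac_is_universal(const uint8_t mac[6])
{
	return ((mac[0] & MAC_LOCAL_BIT) == 0);
}

/* Hypothetical helper: true if the address is a group address. */
static bool
mac_is_multicast(const uint8_t mac[6])
{
	return ((mac[0] & MAC_GROUP_BIT) != 0);
}
```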
- move init and fini code into separate functions (like it is done upstream)
- invoke fini code via shutdown_post_sync event hook
This should make zfs close its underlying devices during shutdown,
which may be important for their drivers.
MFC after: 20 days
Also directly call swapper() at the end of mi_startup instead of
relying on swapper being the last thing in sysinits order.
Rationale:
- "RUN_SCHEDULER" was misleading, scheduling already takes place at that stage
- "scheduler" was misleading, the function swaps in the swapped out processes
- another SYSINIT(SI_SUB_RUN_SCHEDULER, SI_ORDER_ANY) could never be
invoked depending on its relative order with scheduler; this was not obvious
and the bug actually used to exist
Reviewed by: kib (earlier version)
MFC after: 14 days
All other places where a znode is allocated do not need z_vnode at all.
These are:
- zfs_create_share_dir
- zfs_create_fs
This change ensures two things:
- VN_LOCK_ASHARE is not erroneously called for VFIFO vnodes
- vn_lock is called on a fully constructed vnode with correct v_ops
The change also allows zfs_znode_cache_constructor to become a normal
kmem_cache constructor again (as it is in upstream).
This avoids a problem where zfs_znode_cache_destructor
may be called on un-constructed znodes.
MFC after: 17 days
This time it is for a git mirror that stores svn revisions as
git notes, e.g. https://github.com/freebsd/freebsd
MFC after: 10 days
Sponsored by: HybridCluster
addresses added to the UUID generator using uuid_ether_add(). The
UUID generator keeps an arbitrary number of MAC addresses, under
the assumption that they are rarely removed (= uuid_ether_del()).
This achieves the following:
1. It brings us closer to having the network stack as a loadable
module.
2. It allows the UUID generator to filter MAC addresses for best
results (= highest chance of uniqueness).
3. MAC addresses can come from anywhere, irrespective of whether
they are used for an interface or not.
A side-effect of the change is that when no MAC addresses have been
added, a random multicast MAC address is created once and re-used if
needed. Previously, when a random MAC address was needed, it was
created for every call. Thus, a change in behaviour is introduced
for when no MAC addresses exist.
Obtained from: Juniper Networks, Inc.
to be interpreted as a superpage. This is because PG_PTE_PAT is at the same
bit position in PTE as PG_PS is in a PDE.
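
The overlap can be shown with the amd64 bit values: PG_PS and PG_PTE_PAT
are both 0x080, while the PAT bit for a PDE is 0x1000. The helper
pde_is_superpage() is hypothetical and only illustrates why a PTE-style
PAT bit placed in a PDE reads as a superpage mapping.

```c
#include <stdbool.h>
#include <stdint.h>

#define PG_V		UINT64_C(0x001)		/* valid */
#define PG_PS		UINT64_C(0x080)		/* page size bit in a PDE */
#define PG_PTE_PAT	UINT64_C(0x080)		/* PAT bit in a 4KB PTE */
#define PG_PDE_PAT	UINT64_C(0x1000)	/* PAT bit in a superpage PDE */

/* Hypothetical helper: how a page walk interprets a PDE. */
static bool
pde_is_superpage(uint64_t pde)
{
	return ((pde & PG_PS) != 0);
}
```

A PDE built with PG_PTE_PAT therefore tests as a superpage, whereas one
built with PG_PDE_PAT does not.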
This caused a number of regressions on amd64 systems: panic when starting
X applications, freeze during shutdown etc.
Pointy hat to: me
Tested by: gperez@entel.upc.edu, joel, dumbbell
Reviewed by: kib
structure is used, but they already have equal fields in the struct
newipsecstat, that was introduced with FAST_IPSEC and then was merged
together with old ipsecstat structure.
This fixes a kernel stack overflow on some architectures after the
migration of ipsecstat to PCPU counters.
Reported by: Taku YAMAMOTO, Maciej Milewski
arswitch_writereg() routine was writing the registers in the wrong order.
Revert -r241918 as the root problem is now fixed. Remove another workaround
from arswitch_ar7240.c.
Simplify and fix the code on arswitch_writephy() by using
arswitch_writereg().
While here remove a redundant declaration from arswitchvar.h.
Approved by: adrian (mentor)
This fixes the case when etherswitch is printing the information of port 0
vlan group (in port based vlan mode) with no member ports.
Add the ETHERSWITCH_VID_VALID support to ip17x driver.
Add the ETHERSWITCH_VID_VALID support to rt8366 driver.
arswitch doesn't need to be updated as it doesn't support vlan management
yet.
Approved by: adrian (mentor)
bus number into the bus argument. The bus number occupies the least
significant 8 bits. The PCI domain occupies the most significant 24
bits.
On the Altix 350, the PCI domain is a required parameter, but
changing the prototype of the pci_cfgreg*() functions to include a
separate domain argument has wide-spread consequences across the
supported architectures. We'd be changing a known interface.
Multiplexing is an acceptable kluge to give us what we need with
manageable impact. Note that the PCI bus number fits in 8 bits,
so the multiplexing of the domain is a backward compatible change.
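
A minimal sketch of the multiplexing, assuming a 32-bit bus argument.
The function names are hypothetical and are not the ones used by the
pci_cfgreg*() code.

```c
#include <stdint.h>

/* Hypothetical helper: pack a 24-bit domain and an 8-bit bus number. */
static uint32_t
cfg_mux(uint32_t domain, uint32_t bus)
{
	return (((domain & 0xffffffu) << 8) | (bus & 0xffu));
}

/* Hypothetical helper: bus number from the least significant 8 bits. */
static uint32_t
cfg_bus(uint32_t arg)
{
	return (arg & 0xffu);
}

/* Hypothetical helper: PCI domain from the most significant 24 bits. */
static uint32_t
cfg_domain(uint32_t arg)
{
	return (arg >> 8);
}
```

A caller that only knows about domain 0 passes the plain bus number and
gets the same value back from cfg_bus(), which is why the change is
backward compatible.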
fall within the first 256MB of memory. The origin/reason for that
limitation is not known, but it's not believed to be required for
proper initialization. What is known is that the Altix 350 does not
have physical memory at that address (by virtue of the address space
bits).
Keep the boundary at 256MB so that the info block will be covered
by a single direct-mapped translation.
While here, change the flags to M_NOWAIT to eliminate confusion. It
does not change the behaviour of contigmalloc(). What it does is
makes the flags argument explicitly say what the actual behaviour
is.
memory descriptor, don't return NULL as the virtual address, return the
direct-mapped uncacheable virtual address for it. At first, this was
needed only for the Altix 350, but now even some high-end HP machines
have devices mapped to physical addresses that aren't covered by the
EFI memory map.
The racct code in sys_munlock() assumed that the boundaries provided by the
userland were correct as long as vm_map_unwire() returned successfully.
However the latter contains its own logic and sometimes manages to do something
out of those boundaries, even if they are buggy. This change makes the racct
code use the accounting done by the vm layer, as is done in other places
such as vm_mlock().
Despite fixing the panic, Alan Cox pointed out that this code is still racy:
two simultaneous callers will produce incorrect values.
Reviewed by: alc
MFC after: 7 days
---------------------------------------------------------------
System panics during a Port reset with outstanding I/O
---------------------------------------------------------------
It is possible to call mps_mapping_free_memory after this
memory is already freed, causing a panic. Removed this extra
call to mps_mapping_free_memory and call mps_mapping_exit
in place of the mps_mapping_free_memory call so that any
outstanding mapping items can be flushed before memory is
freed.
---------------------------------------------------------------
Correct memory leak during a Port reset with outstanding I/O
---------------------------------------------------------------
In mps_reinit function, the mapping memory was not being
freed before being re-allocated. Added line to call the
memory free function for mapping memory.
---------------------------------------------------------------
Use CAM_SIM_QUEUED flag in Driver IO path.
---------------------------------------------------------------
This flag informs the XPT that successful abort of a CCB
requires an abort ccb to be issued to the SIM. While
processing SCSI IO's, set the CAM_SIM_QUEUED flag in the
status for the IO. When the command completes, clear this
flag.
---------------------------------------------------------------
Check for CAM_REQ_INPROG in I/O path.
---------------------------------------------------------------
Added a check in mpssas_action_scsiio for the In Progress
status for the IO. If this flag is set, the IO has already
been aborted by the upper layer (before CAM_SIM_QUEUED was
set) and there is no need to send the IO. The request will
be completed without error.
---------------------------------------------------------------
Improve "doorbell handshake method" for mps_get_iocfacts
---------------------------------------------------------------
Removed call to get Port Facts since this information is
not used currently.
Added mps_iocfacts_allocate function to allocate memory
that is based on IOC Facts data. Added mps_iocfacts_free
function to free memory that is based on IOC Facts data.
Both of the functions are used when a Diag Reset is performed
or when the driver is attached/detached. This is needed in
case IOC Facts changes after a Diag Reset, which could
happen if FW is upgraded.
Moved call of mps_base_static_config_pages from the attach
routine to after the IOC is ready to process accesses based
on the new memory allocations (instead of polling through
the Doorbell).
---------------------------------------------------------------
Set TimeStamp in INIT message in millisecond format Set the IOC
---------------------------------------------------------------
---------------------------------------------------------------
Prefer mps_wait_command to mps_request_polled
---------------------------------------------------------------
Instead of using mps_request_polled, call mps_wait_command
whenever possible. Change the mps_wait_command function to
check the current context and either use interrupt context
or poll if required by using the pause or DELAY function.
Added a check after waiting 50mSecs to see if the command
has timed out. This is only done if polling, the msleep
command will automatically timeout if the command has taken
too long to complete.
---------------------------------------------------------------
Integrated RAID: Volume Activation Failed error message is
displayed though the volume has been activated.
---------------------------------------------------------------
Instead of failing an IOCTL request that does not have a
large enough buffer to hold the complete reply, copy as
much data from the reply as possible into the user's buffer
and log a message saying that the user's buffer was smaller
than the returned data.
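
A userland sketch of the truncating copy described above. copy_reply()
is a hypothetical name, memcpy() stands in for the copy to the user's
buffer, and the message text is illustrative only.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy as much of the reply as fits and report
 * how many bytes were copied instead of failing the request.
 */
static size_t
copy_reply(void *ubuf, size_t ubuf_len, const void *reply, size_t reply_len)
{
	size_t n;

	n = (reply_len < ubuf_len) ? reply_len : ubuf_len;
	if (reply_len > ubuf_len)
		fprintf(stderr, "user buffer (%zu) smaller than reply (%zu)\n",
		    ubuf_len, reply_len);
	memcpy(ubuf, reply, n);
	return (n);
}
```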
---------------------------------------------------------------
mapping_add_new_device failure due to persistent table FULL
---------------------------------------------------------------
When a new device is added, if it is determined that the
device persistent table is being used and is full, instead
of displaying a message for this condition every time, only
log a message if the MPS_INFO bit is set in the debug_flags.
Submitted by: LSI
MFC after: 1 week
Add a PIM_NOSCAN flag to the CAM path inquiry CCB. This tells CAM
not to perform a rescan on a bus when it is registered.
We now use this flag in the mps(4) driver. Since it knows what
devices it has attached, it is more efficient for it to just issue
a target rescan on the targets that are attached.
Also, remove the private rescan thread from the mps(4) driver in
favor of the rescan thread already built into CAM. Without this
change, but with the change above, the MPS scanner could run before
or during CAM's initial setup, which would cause duplicate device
reprobes and announcements.
sys/param.h:
Bump __FreeBSD_version to 1000039 for the inclusion of the
PIM_NOSCAN CAM path inquiry flag.
sys/cam/cam_ccb.h:
sys/cam/cam_xpt.c:
Added a PIM_NOSCAN flag. If a SIM sets this in the path
inquiry ccb, then CAM won't rescan the bus in
xpt_bus_register.
sys/dev/mps/mps_sas.c
For versions of FreeBSD that have the PIM_NOSCAN path
inquiry flag, don't freeze the sim queue during scanning,
because CAM won't be scanning this bus. Instead, hold
up the boot. Don't call mpssas_rescan_target in
mpssas_startup_decrement; it's redundant and I don't
know why it was in there.
Set PIM_NOSCAN in path inquiry CCBs.
Remove methods related to the internal rescan daemon.
Always use async events to trigger a probe for EEDP support.
In older versions of FreeBSD where AC_ADVINFO_CHANGED is
not available, use AC_FOUND_DEVICE and issue the
necessary READ CAPACITY manually.
Provide a path to xpt_register_async() so that we only
receive events for our own SCSI domain.
Improve error reporting in cases where setup for EEDP
detection fails.
sys/dev/mps/mps_sas.h:
Remove softc flags and data related to the scanner thread.
sys/dev/mps/mps_sas_lsi.c:
Unconditionally rescan the target whenever a device is added.
Sponsored by: Spectra Logic
MFC after: 1 week
USB mouse and USB modem classes. Hopefully someone will find
these examples useful when implementing USB device side drivers
using the FreeBSD USB stack.
cross into regions which are within MSS bytes of a 4GB boundary.
If we encounter the condition, drop the packet.
Reviewed by: Geans Pin geanspin@Broadcom
The Block Event Interrupts, BEI, feature does not
work as expected with the Renesas XHCI chipsets.
Revert feature.
While at it correct the TD SIZE computation in
case of Zero Length Packet, ZLP, at the end of a
multi frame USB transfer.
MFC after: 1 week
PR: usb/180726
for consumption outside the vfs_aio.c.
For SIGEV_THREAD_ID and SIGEV_SIGNAL notification delivery methods,
also copy in the sigev_value, since librt event pumping loop compares
note generation number with the value passed through sigev_value.
Tested by: Petr Salinger <Petr.Salinger@seznam.cz>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
DB120 development board.
The AR934x SoCs are a MIPS74k based system with increased RAM addressing
space, some scratch-pad RAM, an improved gige switch PHY and 2x2 or 3x3
on-board dual-band wifi.
This support isn't complete by any stretch; it's just enough to bring
the board up for others to tinker with. Notably, the MIPS74k support
is broken. However it boots enough to echo some basic probe/attach
messages, before dying somewhere in the TLB code.
Thank you to Qualcomm Atheros for their continued support of me doing
open source work with their hardware.
Tested:
* AR9344, mips74k
This code reads the PLL configuration registers and correctly programs
things so the UART and such can come up.
There are MIPS74k platform issues that need fixing; but this at least brings
things up enough to echo stuff out the serial port and allow for interactive
debugging with ddb.
Tested:
* AR71xx SoCs
* AR933x SoC
* AR9344 board (DB120)
Obtained from: Qualcomm Atheros; Linux/OpenWRT
For all pre-AR933x chips, the frequency is just the APB frequency.
For the AR933x, the UART frequency is different but we just hacked around
it.
For the AR934x, there's a different PLL setting for these, so they have
to be broken out.
the attribute bitmap argument would be non-zero. This caused an
interoperability problem for a recent patch to the Linux NFSv4 client.
The Linux folks have changed their patch to avoid this, but this
patch fixes the problem on the server.
Reported and tested by: Andre Heider (a.heider@gmail.com)
MFC after: 3 days
The creation time support breaks the data structures used in linux
fuse. libfuse carries its own header.
Revert the changes for now. We will try to get an agreement with the
fuse upstream maintainers to avoid having to patch the library
headers all the time.
instruction set. Thumb-2 requires an if-then instruction to implement
conditional codes.
When building for ARM mode the if-then instructions do not generate any
assembled instruction as per the ARMv7-A Architecture Reference Manual, and
are safe to use.
While this allows the atomic instructions to be built, it doesn't mean we
fully support Thumb code. It works in small tests, but is still known to
fail in a large number of places.
While here add a check for the armv6t2 architecture.
new 1Gb server controller chip that will be going into production
soon.
BCM5725 combines MAC with triple-speed PHY, a Network Controller
Sideband Interface (NC-SI) and on-chip memory buffer in a single
device. BCM5725 has an Application Processing Engine (APE) that is
capable of on-chip management and offloading features. BCM5725
supports high-precision clock, time stamp registers for
receive/transmit packets and programmable trigger inputs and
watchdog timeouts. These new features are not yet supported by
bge(4).
Many thanks to Broadcom for continuing to support FreeBSD!
Submitted by: Geans Pin geanspin@Broadcom (initial version)
Reviewed by: Geans Pin geanspin@Broadcom
H/W donated by: Broadcom
Recalculate FUSE_COMPAT_ENTRY_OUT_SIZE and COMPAT_ATTR_OUT_SIZE.
These were wrong in the previous commit. They are actually unused
in FreeBSD though.
Pointed out by: Jan Beich
When birthtime was added (r253331) we missed adding the weight
of the new fields in FUSE_COMPAT_ENTRY_OUT_SIZE and
COMPAT_ATTR_OUT_SIZE. Adjust them accordingly.
Pointed out by: Jan Beich
As part of this commit, add an nvme_strvis() function which borrows
heavily from cam_strvis(). This will allow stripping of
leading/trailing whitespace and also handle unprintable characters
in model/serial numbers. This function goes into a new nvme_util.c
file which is used by both the driver and nvmecontrol.
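
A simplified sketch of the kind of cleanup such a function performs,
assuming fixed-width, space- or NUL-padded identify fields.
strvis_sketch() is a hypothetical name and is not the nvme_strvis()
implementation; in particular, replacing unprintable characters with '?'
is an assumption made for brevity.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy a padded field into a NUL-terminated string,
 * dropping leading/trailing spaces and NULs and masking unprintables.
 */
static void
strvis_sketch(char *dst, size_t dstlen, const char *src, size_t srclen)
{
	size_t i = 0;

	if (dstlen == 0)
		return;
	/* Skip leading padding. */
	while (srclen > 0 && (src[0] == ' ' || src[0] == '\0')) {
		src++;
		srclen--;
	}
	/* Drop trailing padding. */
	while (srclen > 0 &&
	    (src[srclen - 1] == ' ' || src[srclen - 1] == '\0'))
		srclen--;
	/* Copy what fits, leaving room for the terminator. */
	while (srclen > 0 && i + 1 < dstlen) {
		unsigned char c = (unsigned char)*src++;

		dst[i++] = isprint(c) ? (char)c : '?';
		srclen--;
	}
	dst[i] = '\0';
}
```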
Sponsored by: Intel
Reviewed by: carl
MFC after: 3 days
Recent testing with QEMU that has variable sector size support for
NVMe uncovered some of these issues. Chatham prototype boards supported
only 512 byte sectors.
Sponsored by: Intel
Reviewed by: carl
MFC after: 3 days
- Add a new address space allocation method (VMFS_OPTIMAL_SPACE) for
vm_map_find() that will try to alter the alignment of a mapping to match
any existing superpage mappings of the object being mapped. If no
suitable address range is found with the necessary alignment,
vm_map_find() will fall back to using the simple first-fit strategy
(VMFS_ANY_SPACE).
- Change mmap() without MAP_FIXED, shmat(), and the GEM mapping ioctl to
use VMFS_OPTIMAL_SPACE instead of VMFS_ANY_SPACE.
Reviewed by: alc (earlier version)
MFC after: 2 weeks
beasts still exist unfortunately. More details can be found in other
references, but the short version is that bridges with this bit set ignore
I/O port ranges that alias to valid ISA I/O port ranges. In the driver
this requires not allocating these alias regions from the parent device
(so they are free to be acquired by ISA devices), and ensuring no child
devices use resources from these alias regions.
- Change the pcib_window structure to allow for an array of backing
resources rather than a single resource and update the existing code
to cope with this. Some of the coping requires using the saved
base and limit values in pcib_window instead of using rman operations
on the backing resource.
- Add special handling for allocating and adjusting the I/O port window
of an ISA-enabled bridge to only allocate the non-alias ranges and
add those to the associated resource manager.
- Reject I/O port allocations for a fixed request that conflicts with an
ISA alias range.
- Remove the "no prefetched decode" verbose printf during boot. The absence
of a "prefetched decode" line is sufficient.
- Replace the "subtractively decoded bridge" verbose printf with a single
printf that lists all the "special" decoding modes of a bridge: ISA,
subtractive, and VGA.
- Add a custom bus_release_resource() method to the PCI bus driver so that
it can properly free resources for I/O windows of PCI-PCI bridges.
(These resources are not stored in the bridge device's resource list.)
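
A sketch of the range test implied above, assuming the usual definition
of the ISA enable bit: within the first 64KB of I/O space, the top 768
bytes of every 1KB block (offsets 0x100-0x3ff) are not forwarded by the
bridge. The function names are hypothetical and this is not the code in
the PCI-PCI bridge driver, which may treat additional special cases.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a single port is in a blocked ISA range. */
static bool
port_in_isa_range(uint32_t port)
{
	return (port < 0x10000 && (port & 0x300) != 0);
}

/* Hypothetical helper: true if [start, end] touches a blocked ISA range. */
static bool
range_hits_isa_range(uint32_t start, uint32_t end)
{
	uint32_t next;

	if (start >= 0x10000)
		return (false);
	if (port_in_isa_range(start))
		return (true);
	/* start is in the first 256 bytes of its 1KB block. */
	next = (start & ~0x3ffu) + 0x100;
	return (next <= end);
}
```

A fixed request for 0x1000-0x10ff passes this test, while one for
0x1000-0x1100 would be rejected because it reaches into the alias range
starting at 0x1100.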
PR: misc/179033
MFC after: 2 weeks
generic and apply to all sysfs attributes:
- Use sysctl_handle_string() instead of reimplementing it.
- Remove trailing newline from the current value before passing it to
userland and append a newline to the new string value before passing it
to the attribute's store function.
- Don't leak the temporary buffer if the first error check triggers.
- Revert earlier change to mlx4 port mode handler.
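
The newline handling can be sketched as two small string helpers. The
names are hypothetical; the real handler works on the buffer it passes
to sysctl_handle_string() and to the attribute's store function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: drop one trailing newline before showing a value. */
static void
strip_newline(char *s)
{
	size_t len = strlen(s);

	if (len > 0 && s[len - 1] == '\n')
		s[len - 1] = '\0';
}

/*
 * Hypothetical helper: append a newline before handing a new value to a
 * store function. Returns -1 if the buffer of 'size' bytes is too small.
 */
static int
append_newline(char *s, size_t size)
{
	size_t len = strlen(s);

	if (len + 2 > size)
		return (-1);
	s[len] = '\n';
	s[len + 1] = '\0';
	return (0);
}
```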
PR: kern/174213
Submitted by: Garrett Cooper
Reviewed by: Shakar Klein @ Mellanox
MFC after: 1 week
commands during controller initialization.
DELAY() does not work here during config_intrhook context - we need to
explicitly relinquish the CPU for the admin command completion to
get processed.
Sponsored by: Intel
Reported by: Adam Brooks <adam.j.brooks@intel.com>
Reviewed by: carl
MFC after: 3 days
and firmware revision in the controller's identify structure.
Also modify consumers of these fields to ensure they only use the
specified number of bytes for their respective fields.
Sponsored by: Intel
Reviewed by: carl
MFC after: 3 days
to Ethernet and the subsequent port being set to IB.
Submitted by: Shakar Klein @ Mellanox
Tested by: Morgan Robertson <morganrobertson@gmail.com>
MFC after: 1 week
The read DMA request logic operation is based on having sufficient
available space in the transmit data buffer (TXMBUF) before a read
DMA can be requested. There are four read DMA channels that use
the TXMBUF, and the logic checks if the available free space in the
TXMBUF is large enough for all the data in the four Send Buffers
for which buffer descriptors have been fetched. The Enable_Request
signal is asserted only if the free TXMBUF space is larger than the
sum of the four DMA length registers. The power-up default value
of BGE_RDMA_LSO_CRPTEN_CTRL register bit 25 (bit 21 on BCM5720) is
zero, which selects the DMA length registers to connect to the
input of the adder block. The DMA length registers are
asynchronously reset following BCM5719/BCM5720 power-up, and due to
the lack of synchronous deassertion of the length registers reset
signal these registers may contain uninitialized values following
the reset deassertion.
In the case of the failure the uninitialized DMA length register
values added up to more than the TXMBUF size, which prevented the
assertion of the Enable_Request signal and any subsequent read DMA
to start. This lockup condition is the root cause of failing to
generate any transmit traffic.
To work around the issue, select alternate output of multiplexers
and transmit the first four Ethernet frames. This overwrites the
DMA length registers with valid values.
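
The gating condition can be modelled in a few lines. This is a software
model of the description above, not driver code; the function name and
the example sizes are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: Enable_Request is asserted only if the free TXMBUF
 * space is larger than the sum of the four DMA length registers.
 */
static bool
rdma_enable_request(uint32_t txmbuf_free, const uint32_t dma_len[4])
{
	uint64_t sum = 0;
	int i;

	for (i = 0; i < 4; i++)
		sum += dma_len[i];
	return (txmbuf_free > sum);
}
```

With uninitialized length registers the sum can exceed any possible free
space, so the request is never enabled; once four frames have written
valid lengths the condition can be satisfied again.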
Reported by: Geans Pin <geanspin@broadcom.com>
Reviewed by: Geans Pin <geanspin@broadcom.com>
constraint to 8. Previously it may have triggered watchdog
timeouts.
o Check whether interrupt is ours or not.
o Enable interrupts before attempting to transmit queued packets.
This will slightly improve TX performance.
o No need to clear IFF_DRV_OACTIVE in a loop. AE_FLAG_TXAVAIL is
used to know whether there is enough available TxD ring space.
o Added missing bus_dmamap_sync(9) in ae_rx_intr() and rearranged
code to avoid unnecessary register access.
o Make sure to clear TxD, TxS, RxD rings in driver initialization.
Otherwise some data in these rings could be interpreted as
'updated' which in turn will advance internally maintained
pointers and can trigger watchdog timeouts.
PR: kern/180382
- We should check is_d32 to see how many registers we have
- In vfp_restore mark vfpscr as an output register
Without the second part it appears we can return the incorrect value from
vfp_bounce if the VFP condition flags are set as it may override the
register holding the return value.
make the ARM EABI the default ABI on arm, armeb, armv6 and armv6eb.
This is intended to be the default ABI from now on with the old ABI to be
retired. Because of this all users are strongly suggested to upgrade to the
ARM EABI.
As the two ABIs are incompatible it is unlikely upgrading in place will
work. Users should perform a full backup and either use an external machine
to upgrade, or install to an alternative location on their media. They
should also reinstall all ports or packages when these are available.
The only known issues are:
- pkg incorrectly detects the ABI. This is fixed upstream, and a
patch will be made to the port.
- GDB can have issues with executables built with clang.
__FreeBSD_version has been bumped.
settings for ACPI-enumerated serial ports by forcing any IRQs that use
an ISA IRQ value with these settings to active-high instead of active-low.
This is known to occur with the BIOS on an Intel D2500CCE motherboard.
Tested by: Robert Ames <robertames@hotmail.com>, lev
Submitted by: Juergen Weiss weiss at uni-mainz.de (original patch)
Now that r253351 moved sendfile() stats to a separate struct, the
last field used in mbstat is m_mcfail, which is updated, but never
read or obtained from userland.
Submitted by: adrian, zec
Fix multiple kernel panics when VIMAGE is enabled in the kernel.
These fixes are based on patches submitted by Adrian Chadd and Marko Zec.
(1) Set curthread->td_vnet to vnet0 in device_probe_and_attach() just before calling
device_attach(). This fixes multiple VIMAGE related kernel panics
when trying to attach Bluetooth or USB Ethernet devices because
curthread->td_vnet is NULL.
(2) Set curthread->td_vnet in if_detach(). This fixes kernel panics when detaching networking
interfaces, especially USB Ethernet devices.
(3) Use VNET_DOMAIN_SET() in ng_btsocket.c
(4) In ng_unref_node() set curthread->td_vnet. This fixes kernel panics
when detaching Netgraph nodes.
Bring in the changes from the FUSE kernel interface 7.10
(available under a BSD license).
After 7.10 the Linux FUSE developers added support for a
controversial CUSE driver and some Linux-specific
features that are unlikely to find their way into FreeBSD.
We currently don't implement any of the new features so we
are *not* bumping the FUSE_KERNEL_MINOR_VERSION. The header
should, nevertheless, serve as a template to add the new
features in a compatible manner.
While here adopt some minor cleanups from the upstream version
like removing FUSE_MAJOR and FUSE_MINOR which were never
used. Also add multiple inclusion header guards.
We need to fix wpa_supplicant because it checks whether the card has
ic_cryptocaps set. Since net80211 can do software encryption this check in
wpa_supplicant is wrong.
"Gamers Keyboards" by adding a tunable, "hw.usb.ukbd.pollrate", which
can fix the polling rate of the attached USB keyboards in the range
1..1000Hz. A similar feature already exists in the USB mouse
driver. Use with care! Might leave you without keyboard input. This
feature is only available when the USB_DEBUG option is set in the
kernel configuration file.
Correct "unit" type to "int" while at it.
I was keeping this #ifdef'd for reference with the MacFUSE change[1]
but on second thought, this is a FreeBSD-only header so the SVN
history should be enough.
Add missing padding while here.
Reference [1]:
http://code.google.com/p/macfuse/source/detail?spec=svn1686&r=1360
a mailbox command and which registers to copy back in when
the command completes, the bits being set need to not only
specify what bits you want to add from the default from the
table but also what bits you want *subtract* (mask) from the
default from the table.
A failing ISP2200 command pointed this out.
Much appreciation to: marius, who persisted and narrowed down what
the failure delta was, and shamed me into actually fixing it.
MFC after: 1 week
function is leaf. The frame allows ddb to not lose the direct caller
of bcopy() in backtrace.
Other functions from support.s would benefit from the same change as
well, but for now bcopy() is the most frequent offender.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
"Logical unit not supported" errors. First initiates specific target rescan,
second -- destroys specific LUN. This allows changes in the list of device
LUNs to be detected automatically. This mechanism doesn't work when the target
is completely idle, but that is probably all that can be done without active
polling.
Reviewed by: ken
Sponsored by: iXsystems, Inc.
additions.
* Add some new tracing events to aid in debugging.
* Add in a debugging mode to drop transmit and received frames, specifically
to test whether seeing or hearing heartbeats correctly cause LACP to
drop the port.
* Add in (and make default) a strict LACP mode, which requires the
heartbeat on a port to be heard before it's used. Sometimes vendor ports
will hang but the link layer stays up, resulting in hung traffic.
* Add logging the number of link status flaps, again to aid in debugging
badly behaving switch ports.
* Calculate the lagg interface port speed as the multiple of the
configured ports, rather than the largest.
Obtained from: Netflix
MFC after: 2 weeks
ixgbe driver. As it was, when building them as a module INET
and INET6 are not defined. In these drivers it does not cause
a panic, however it does result in different behavior in the
ioctl routine when you are using a module vs static, and I
think the behavior should be the same.
MFC after: 3 days
MFprojects/camlock r248982:
Stop abusing xpt_periph in random places that really have no periph related
to CCB, for example, bus scanning. NULL value is fine in such cases and it
is correctly logged in debug messages as "noperiph". If at some point we
need some real XPT periphs (alike to pmpX now), quite likely they will be
per-bus, and not a single global instance as xpt_periph now.
when building the driver as a module, the present system leaves
INET and INET6 undefined, which will cause
the panic in ixgbe_tso_setup(). The Makefile in the module directory
now renders the conditional in the source unnecessary and wrong.
MFC after: ASAP - the panic as a module must not get into 9.2
duplicated sockets a multicast address is bound and either
SO_REUSEPORT or SO_REUSEADDR is set.
But actually it works for the following combinations:
* SO_REUSEPORT is set for the first socket and SO_REUSEPORT for the new;
* SO_REUSEADDR is set for the first socket and SO_REUSEADDR for the new;
* SO_REUSEPORT is set for the first socket and SO_REUSEADDR for the new;
and fails for this:
* SO_REUSEADDR is set for the first socket and SO_REUSEPORT for the new.
Fix the last case.
PR: 179901
MFC after: 1 month
block copy, when copying the superblock into the snapshot. UFS1 does
not align superblock on the block boundary, and bcopy runs off the end
of the buffer.
Reported by: Andre Albsmeier <Andre.Albsmeier@siemens.com>
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
changers that don't support the DVCID and CURDATA bits that were
introduced in the SMC spec.
These changers will return an Illegal Request type error if the
bits are set. This causes "chio status" to fail.
The fix is two-fold. First, for changers that claim to be SCSI-2
or older, don't set the DVCID and CURDATA bits for READ ELEMENT
STATUS. For newer changers (SCSI-3 and newer), we default to
setting the new bits, but back off and try the READ ELEMENT STATUS
without the bits if we get an Illegal Request type error.
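The back-off logic described above can be sketched in userland C as follows.
This is only an illustration under stated assumptions: struct changer,
fake_read_element_status(), the RES_* codes and Q_NO_DVCID are hypothetical
stand-ins for the driver state, the real READ ELEMENT STATUS command and
CH_Q_NO_DVCID; it is not the scsi_ch.c code.

```c
#include <assert.h>
#include <stdbool.h>

#define	Q_NO_DVCID	0x01	/* hypothetical mirror of CH_Q_NO_DVCID */

enum { RES_OK = 0, RES_ILLEGAL_REQUEST = 1 };	/* hypothetical result codes */

struct changer {
	int		scsi_rev;	/* 2 = SCSI-2, 3 = SCSI-3, ... */
	unsigned	quirks;
	bool		supports_bits;	/* simulated device behaviour */
};

/* Simulated command: fails only if the bits are set and unsupported. */
static int
fake_read_element_status(const struct changer *c, bool set_bits)
{
	return ((set_bits && !c->supports_bits) ? RES_ILLEGAL_REQUEST : RES_OK);
}

/* SCSI-2 and older changers never get the new bits. */
static void
changer_init(struct changer *c)
{
	assert(c->scsi_rev > 0);
	if (c->scsi_rev <= 2)
		c->quirks |= Q_NO_DVCID;
}

/* Try with the bits unless quirked; on Illegal Request retry without. */
static int
get_elem_status(struct changer *c)
{
	bool set_bits = (c->quirks & Q_NO_DVCID) == 0;
	int error = fake_read_element_status(c, set_bits);

	if (error == RES_ILLEGAL_REQUEST && set_bits) {
		error = fake_read_element_status(c, false);
		if (error == RES_OK)
			c->quirks |= Q_NO_DVCID;	/* remember for next time */
	}
	return (error);
}
```

Once the quirk is recorded, later calls skip the failing first attempt.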
This has been tested on a Qualstar TLS-8211, which is a SCSI-2
changer that does not support the new bits, and a Spectra T-380,
which is a SCSI-3 changer that does support the new bits. In the
absence of a SCSI-3 changer that does not support the bits, I
tested that with some error injection code. (The SMC spec says
that support for CURDATA is mandatory, and DVCID is optional.)
scsi_ch.c: Add a new quirk, CH_Q_NO_DVCID that gets set for
SCSI-2 and older libraries, or newer libraries that
report errors when the DVCID/CURDATA bits are set.
In chgetelemstatus(), use the new quirk to
determine whether or not to set DVCID and CURDATA.
If we get an error with the bits set, back off and
try without the bits. Set the quirk flag if the
read element status succeeds without the bits set.
Increase the READ ELEMENT STATUS timeout to 60
seconds after testing with a Spectra T-380. The
previous value was 10 seconds, and too short for
the T-380. This may be decreased later after
some additional testing and investigation.
Tested by: Andre Albsmeier <Andre.Albsmeier@siemens.com>
Sponsored by: Spectra Logic
MFC after: 3 days
Submitted by: "YAMAMOTO, Shigeru" <shigeru@iij.ad.jp>
Reviewed by: adrian
In PC-BSD 9.1, VIMAGE is enabled in the kernel config.
For laptops with Bluetooth capability, such as the HP Elitebook 8460p,
the kernel will panic upon bootup, because curthread->td_vnet
is not initialized.
Properly initialize curthread->td_vnet when initializing the Bluetooth stack.
This allows laptops such as the HP Elitebook 8460p laptop
to properly boot with VIMAGE kernels.
to drain the reserve. This was broken in r243040, causing deadlock.
Note that VM_WAIT call in case of uma_zalloc() failure from pagedaemon
would only wait for the v_pageout_free_min anyway.
Reported and tested by: pho
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
information into the ISN (initial sequence number) without the additional
use of timestamp bits and switching to the very fast and cryptographically
strong SipHash-2-4 MAC hash algorithm to protect the SYN cookie against
forgeries.
The purpose of SYN cookies is to encode all necessary session state in
the 32 bits of our initial sequence number to avoid storing any information
locally in memory. This is especially important when under heavy spoofed
SYN attacks where we would either run out of memory or the syncache would
fill with bogus connection attempts swamping out legitimate connections.
The original SYN cookies method only stored an indexed MSS value in the
cookie. This isn't sufficient anymore and breaks down in the presence of
WSCALE information which is only exchanged during SYN and SYN-ACK. If we
can't keep track of it then we may severely underestimate the available
send or receive window. This is compounded with large windows whose size
information on the TCP segment header is even lower numerically. A number
of years back SYN cookies were extended to store the additional state in
the TCP timestamp fields, if available on a connection. While timestamps
are common among the BSD, Linux and other *nix systems, Windows never enabled
them by default and thus they are not present for the vast majority of clients
seen on the Internet.
The common parameters used on TCP sessions have changed quite a bit since
SYN cookies were invented some 17 years ago. Today we have a lot more
bandwidth available making the use of window scaling almost mandatory. Also
SACK has become standard making recovering from packet loss much more
efficient.
This change moves all necessary information into the ISS removing the need
for timestamps. Both the MSS (16 bits) and send WSCALE (4 bits) are stored
in 3 bit indexed form together with a single bit for SACK. While this is
significantly less than the original range, it is sufficient to encode all
common values with minimal rounding.
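As a rough illustration of the 3 bit indexed form, the following userland
sketch rounds MSS and send WSCALE down to table entries and packs both
indices plus the SACK bit into 7 bits. The table contents, the bit positions
and all names (mss_tab, ws_tab, sc_pack, sc_unpack) are assumptions made for
this example; the kernel arrays tcp_sc_msstab[] and tcp_sc_wstab[] and the
actual layout inside the ISS may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative tables only, sorted ascending; not the kernel's values. */
static const uint16_t mss_tab[8] = { 216, 536, 1200, 1360, 1400, 1440, 1452, 1460 };
static const uint8_t ws_tab[8] = { 0, 1, 2, 3, 4, 6, 7, 8 };

struct sc_params {
	uint16_t	mss;
	uint8_t		wscale;
	bool		sack;
};

/* Index of the largest table entry <= mss; clamps to index 0. */
static unsigned
mss_index(uint16_t mss)
{
	unsigned i;

	for (i = 7; i > 0; i--)
		if (mss_tab[i] <= mss)
			break;
	return (i);
}

/* Index of the largest table entry <= wscale. */
static unsigned
ws_index(uint8_t wscale)
{
	unsigned i;

	for (i = 7; i > 0; i--)
		if (ws_tab[i] <= wscale)
			break;
	return (i);
}

/* Pack as: 3 bits MSS index, 3 bits WSCALE index, 1 bit SACK. */
static uint8_t
sc_pack(const struct sc_params *p)
{
	assert(p->wscale <= 14);	/* largest valid TCP window shift */
	return ((uint8_t)((mss_index(p->mss) << 4) |
	    (ws_index(p->wscale) << 1) | (p->sack ? 1u : 0u)));
}

/* Recover the (rounded) parameters from the packed bits. */
static struct sc_params
sc_unpack(uint8_t bits)
{
	struct sc_params p;

	p.mss = mss_tab[(bits >> 4) & 0x7];
	p.wscale = ws_tab[(bits >> 1) & 0x7];
	p.sack = (bits & 0x1) != 0;
	return (p);
}
```

With these example tables an MSS of 1455 comes back as 1452 and a send
WSCALE of 5 comes back as 4, which is the rounding down discussed here.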
The MSS depends on the MTU of the path and with the dominance of ethernet
the main value seen is around 1460 bytes. Encapsulations for DSL lines
and some other overheads reduce it by a few more bytes for many connections
seen. Rounding down to the next lower value in some cases isn't a problem
as we send only slightly more packets for the same amount of data.
The send WSCALE index is a bit more tricky as rounding down under-estimates
the send space available towards the remote host, however a small
number of values dominate and are carefully selected again.
The receive WSCALE isn't encoded at all but recalculated based on the local
receive socket buffer size when a valid SYN cookie returns. A listen socket
buffer size is unlikely to change while active.
The index values for MSS and WSCALE are selected for minimal rounding errors
based on large traffic surveys. These values have to be periodically
validated against newer traffic surveys adjusting the arrays tcp_sc_msstab[]
and tcp_sc_wstab[] if necessary.
In addition the hash MAC to protect the SYN cookies is changed from MD5
to SipHash-2-4, a much faster and cryptographically secure algorithm.
Reviewed by: dwmalone
Tested by: Fabian Keil <fk@fabiankeil.de>
hash function) optimized for speed on short messages returning a 64bit hash/
digest value.
SipHash is simpler and much faster than other secure MACs and competitive
in speed with popular non-cryptographic hash functions. It uses a 128-bit
key without the hidden cost of a key expansion step. SipHash iterates a
simple round function consisting of four additions, four xors, and six
rotations, interleaved with xors of message blocks for a pre-defined number
of compression and finalization rounds. The absence of secret load/store
addresses or secret branch conditions avoids timing attacks. No state is
shared between messages. Hashing is deterministic and doesn't use nonces.
It is not susceptible to length extension attacks.
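For reference, the round function mentioned above can be written as the
following standalone sketch, following the SipRound definition in the SipHash
paper. The names ROTL64 and sipround are chosen for this example and are not
necessarily those used in the implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit value left by b bits, 0 < b < 64. */
#define	ROTL64(x, b)	(((x) << (b)) | ((x) >> (64 - (b))))

/*
 * One SipRound over the four 64-bit state words v[0..3]:
 * four additions, four xors and six rotations.
 */
static void
sipround(uint64_t v[4])
{
	assert(v != 0);

	v[0] += v[1]; v[1] = ROTL64(v[1], 13); v[1] ^= v[0];
	v[0] = ROTL64(v[0], 32);
	v[2] += v[3]; v[3] = ROTL64(v[3], 16); v[3] ^= v[2];
	v[0] += v[3]; v[3] = ROTL64(v[3], 21); v[3] ^= v[0];
	v[2] += v[1]; v[1] = ROTL64(v[1], 17); v[1] ^= v[2];
	v[2] = ROTL64(v[2], 32);
}
```

SipHash-2-4 applies two such rounds per message block and four at
finalization; SipHash-4-8 doubles both counts.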
Target applications include network traffic authentication, message
authentication (MAC) and hash-tables protection against hash-flooding
denial-of-service attacks.
The number of update/finalization rounds is defined during initialization:
SipHash24_Init() for the fast and reasonably strong version.
SipHash48_Init() for the strong version (half as fast).
SipHash usage is similar to other hash functions:
struct SIPHASH_CTX ctx;
char *k = "16bytes long key";
char *s = "string";
uint64_t h = 0;
SipHash24_Init(&ctx);
SipHash_SetKey(&ctx, k);
SipHash_Update(&ctx, s, strlen(s));
SipHash_Final(&h, &ctx); /* or */
h = SipHash_End(&ctx); /* or */
h = SipHash24(&ctx, k, s, strlen(s));
It was designed by Jean-Philippe Aumasson and Daniel J. Bernstein and
is described in the paper "SipHash: a fast short-input PRF", 2012.09.18:
https://131002.net/siphash/siphash.pdf
Permanent ID: b9a943a805fbfc6fde808af9fc0ecdfa
Implemented by: andre (based on the paper)
Reviewed by: cperciva
is being wired now. The entry wired count is changed to non-zero in
advance, before the map lock is dropped. This makes vm_fault()
perceive the entry as wired and breaks the fragment which moves the
wire count from the shadowed page to the upper page, so that the code
unwires a non-wired page.
On the other hand, the vm_fault() calls from vm_fault_wire() should be
allowed to proceed, so only drain MAP_ENTRY_IN_TRANSITION from
vm_fault() when wiring_thread is not current.
Reported and tested by: pho
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
parallel creation of the map entries, e.g. by mmap() or stack growing.
It also breaks when another entry is wired in parallel.
The vm_map_wire() function iterates over the map entries in the region
and assumes that the map entries it finds were marked as in transition
before, and also that any entry marked as in transition was marked by the
current invocation of vm_map_wire(). This is not true for new entries in the
holes.
Add the thread owner of the MAP_ENTRY_IN_TRANSITION flag to struct
vm_map_entry. In vm_map_wire() and vm_map_unwire(), only process the
entries whose transition owner is the current thread.
Reported and tested by: pho
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
msync(MS_INVALIDATE). The vm_fault_copy_entry() function requires that the
object range which corresponds to the user-wired vm_map_entry is always
fully populated.
Add OBJPR_NOTWIRED flag for vm_object_page_remove() to request the
preserving behaviour, use it when calling vm_object_page_remove() from
vm_object_sync().
Reported and tested by: pho
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
not busy, since its only caller brelse() can legitimately call it on a
busy page. This happens for VOP_PUTPAGES() on filesystems that use
buffers and whose VOP_WRITE() method marked the buffer containing the page
as non-cacheable.
Reported and tested by: pho
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
error if any user wired mappings exist. Doing the invalidation
destroys the user wiring.
The change is a temporary measure to close the bug; the more proper
fix is to always delegate the invalidation of the page to upper
layers.
Reported and tested by: pho
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
processing. Thanks to John Baldwin for catching this. Not
clearing the flag member of the rxbuf could result in a NULL
mbuf pointer being used.
MFC after: 2 days (this needs to get into 9.2!)
isdir? ( fd -- bool )
freaddir ( fd -- ptr len TRUE | FALSE )
The 'isdir?' word returns `true' if the file descriptor is for a
directory and `false' otherwise.
The 'freaddir' word reads the next directory entry and if successful,
returns its name and 'true'. Otherwise 'false' is returned.
These words give the loader the ability to scan directories and read
files contained in them for 'rc.d'-like flexibility in handling which
modules to load and/or which tunables to set.
Obtained from: Juniper Networks, Inc.
H/W not de-asserting the interrupt at all. On x86, and because of the
following conditions, this results in a hard hang with interrupts disabled:
1. The uart(4) driver uses a spin lock to protect against concurrent
access to the H/W. Spin locks disable and restore interrupts.
2. Restoring the interrupt on x86 always writes the flags register. Even
if we're restoring the interrupt from disabled to disabled.
3. The x86 CPU has a short window in which interrupts are enabled when the
flags register is written.
4. The uart(4) driver registers a fast interrupt by default.
To catch this case, we first try to clear any pending H/W interrupts and in
particular, before setting up the interrupt. This makes sure the interrupt
is masked on the PIC. The interrupt handler now has a limit set on the
number of iterations it'll go through to clear interrupt conditions. If the
limit is hit, the handler will return FILTER_SCHEDULE_THREAD. The attach
function will check for this return code and avoid setting up the interrupt
and force polling in that case.
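The bounded handler loop can be pictured with the following userland sketch.
It only simulates the idea: struct fake_dev, fake_dev_service() and the
INTR_* values are hypothetical, with INTR_STUCK standing in for the
FILTER_SCHEDULE_THREAD return that the attach function checks for.

```c
#include <assert.h>
#include <stdbool.h>

enum intr_result { INTR_HANDLED, INTR_STUCK };	/* hypothetical codes */

/* Simulated hardware: some pending conditions, or a stuck interrupt. */
struct fake_dev {
	int	pending;
	bool	stuck;		/* H/W never de-asserts the interrupt */
};

/* Service one condition; return true if the interrupt is still asserted. */
static bool
fake_dev_service(struct fake_dev *d)
{
	if (d->stuck)
		return (true);
	if (d->pending > 0)
		d->pending--;
	return (d->pending > 0);
}

/* Give up after 'limit' iterations instead of spinning forever. */
static enum intr_result
bounded_intr(struct fake_dev *d, int limit)
{
	int i;

	assert(limit > 0);
	for (i = 0; i < limit; i++) {
		if (!fake_dev_service(d))
			return (INTR_HANDLED);
	}
	return (INTR_STUCK);
}
```

A caller that sees INTR_STUCK would then skip interrupt setup and fall back
to polling, as described above.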
Obtained from: Juniper Networks, Inc.
about mount and unmount events. This is used by Juniper to implement a more
optimal implementation of NetBSD's veriexec.
This change differs from r253224 in the following way:
o The vfs_mounted handler is called before mountcheckdirs() and with
newdp locked. vp is unlocked.
o The event handlers are declared in <sys/eventhandler.h> and not in
<sys/mount.h>. The <sys/mount.h> header is used in user land code
that pretends to be kernel code and as such creates a very convoluted
environment. It's hard to untangle.
Submitted by: stevek@juniper.net
Discussed with: pjd@
Obtained from: Juniper Networks, Inc.
don't declare a variable. The size before/after this change of the structs
doesn't change with gcc/clang.
Noticed by: several
Suggested by: Gary Jennejohn <gljennjohn@googlemail.com>
reuse as the pv chunk page in reclaim_pv_chunk(). Having non-NULL
m->object is wrong for page not owned by an object and confuses both
vm_page_free_toq() and vm_page_remove() when the page is freed later.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
structure copying in random_ident_hardware(). This change will also help
further modularization of random(4) subsystem.
Submitted by: arthurmesh@gmail.com
Reviewed by: obrien
Obtained from: Juniper Networks
VMware up to at least ESXi 5.1. Actually, using INTx in that case instead
may still result in interrupt storms, with MSI being the only working
option in some configurations. So introduce a PCI_QUIRK_DISABLE_MSIX quirk
which only blacklists MSI-X but not also MSI and use it for the VMware
PCI-PCI-bridges. Note that, currently, we still assume that if MSI doesn't
work, MSI-X won't work either - but that's part of the internal logic and
not guaranteed as part of the API contract. While at it, add and employ
a pci_has_quirk() helper.
Reported and tested by: Paul Bucher
- Use NULL instead of 0 for pointers.
Submitted by: jhb (mostly)
Approved by: jhb
MFC after: 3 days
vfs_busy(mp);
vfs_write_suspend(mp);
which are problematic if another thread starts an unmount between the two
calls. The unmount starts a write, while vfs_write_suspend() drains
writers. On the other hand, unmount drains busy references, causing
the deadlock.
Add a flag argument to vfs_write_suspend and require the callers of it
to specify VS_SKIP_UNMOUNT flag, when the call is performed not in the
mount path, i.e. the covered vnode is not locked. The suspension is
not attempted if VS_SKIP_UNMOUNT is specified and unmount is in
progress.
Reported and tested by: Andreas Longwitz <longwitz@incore.de>
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
The distance between ticks and td_swvoltick should be calculated as
an unsigned number. Previously we could end up comparing a negative
number with hogticks in which case should_yield() would give an incorrect
answer.
We should probably ensure that td_swvoltick is properly initialized.
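A minimal userland sketch of the unsigned distance calculation follows. The
function elapsed_at_least() and its parameters are hypothetical stand-ins for
the ticks, td_swvoltick and hogticks comparison; it is not the kernel code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Return true if at least 'threshold' ticks passed between 'last' and
 * 'now'.  Subtracting as unsigned keeps the result well defined and
 * correct even after the signed tick counter has wrapped around.
 */
static bool
elapsed_at_least(int now, int last, int threshold)
{
	unsigned int delta;

	assert(threshold >= 0);
	delta = (unsigned int)now - (unsigned int)last;
	return (delta >= (unsigned int)threshold);
}
```

For example, with last = INT_MAX - 2 and now = INT_MIN + 2 (the counter
wrapped in between) the unsigned distance is 5, whereas a signed subtraction
would overflow.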
Sponsored by: HybridCluster
MFC after: 5 days
Unconditionally freeing a page is not good, especially if it is the page
that was wired by the caller. The checks are picked up from
kern_sendfile.
MFC after: 3 weeks