This breaks cross-building with WITH_META_MODE since it will rebuild
'build-tools' during the 'everything' phase.
A more proper fix is coming to bmake to implicitly require .META unless
.NOMETA (and other restrictions) are in place.
Improvements after r301220.
Bus space methods are not called so simple pmap_mapdev will suffice.
Use OF_getencprop to get buffer with already converted endianess.
Pointed out by: ian
Submitted by: Michal Stanek <mst@semihalf.com>
Obtained from: Semihalf
Reduce number of iterations used for calibrating ICR read loop. The
new number of iteration still gives the same ICR latency as before,
tested on Intel SandyBridge and Haswell machines, and on AMD. But it
significantly reduces the unneeded pause on boot in some VMs, from ~10
secs to less then 1 sec. It was reported to occur in bhyve on AMD
host.
Reported and tested by: avg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This is a followup to r300131.
A filesystem's root vnode can be reached not only through VSF_ROOT, but
by other means as well. For example, via a dot-dot lookup.
Also, a root vnode can get reclaimed and then re-created. For these
reasons it was insufficient to clear VV_ROOT flag from a root vnode of a
snapshot mounted under .zfs in zfsctl_snapdir_lookup().
So, now we set the flag in zfs_znode_sa_init() only if a vnode
represent a root of a filesystem or a standalone snapshot.
That is, the flag is not set for snapshots mounted under .zfs.
MFC after: 2 weeks
It could happen in an unlikely case that we fail to lock the root vnode
with requested flags (which appear to never include LK_NOWAIT).
MFC after: 1 week
Add accessor functions to toggle the state per VNET.
The base system (vnet0) will always enable itself with the normal
registration. We will share the registered protocol handlers in all
VNETs minimising duplication and management.
Upon disabling netisr processing for a VNET drain the netisr queue from
packets for that VNET.
Update netisr consumers to (de)register on a per-VNET start/teardown using
VNET_SYS(UN)INIT functionality.
The change should be transparent for non-VIMAGE kernels.
Reviewed by: gnn (, hiren)
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6691
The current error path in case of failure during attach/initialization is
not correct and leaves blkback in a stuck state. This is due to blkback
waiting for blkfront to switch to state XenbusStateClosed, but if blkfront
never attached (because the guest is not even started) it cannot possibly
make it to that state.
Instead just wait for the frontend to be in a state different than
XenbusStateConnected in order to proceed with the shutdown. Also, it is
wrong to call xbb_detach directly because it destroys the lock which can
still be used by xbb_frontend_changed.
Sponsored by: Citrix Systems R&D
Hotplug scripts are needed in order to use fancy disk configurations in xl,
like iSCSI disks. The job of hotplug scripts is to locally attach the disk
and present it to blkback as a block device or a regular file.
This change introduces a new xenstore node in the blkback hierarchy, called
"physical-device-path". This is a straigh replacement for the "params" node,
which was used before.
Hotplug scripts will need to read the "params" node, perform whatever
actions are necessary and then write the "physical-device-path" node. The
hotplug script is also in charge of detaching the disk once the domain has
been shutdown.
Sponsored by: Citrix Systems R&D
controller devices are attached. This has already been done for
bus_setup_intr().
There was no doubt that if someone wants to setup an interrupt,
corresponding interrupt controller device must already be attached.
However, the same must be valid for allocation of an interrupt resource
unless the allocation is done blindly, without any information that
such interrupt even exists. While it was done this blind way before,
it won't be possible after next INTRNG change.
framework has significantly changed the driver has moved to a new file.
While it shares some code with the existing driver this has been modified
to work better with the intrng framework.
This has been tested on the ThunderX servers in the netperf cluster and has
been used to boot them for other testing, including DTrace and hwpmc.
With this we can use intrng on all supported arm64 platforms I was able to
test on. It is expected we will move to intrng soon, and disable the old
arm64 interrupt framework.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6437
range of interrupts they pass to a second controller driver to handle.
The parent driver is expected to detect when one of these interrupts has
been triggered and call intr_child_irq_handler to pass the interrupt to
a child. The children controllers are then expected to manage the range
by allocating interrupts as needed.
This will initially be used by the ARM GICv3 driver, but is is expected to
be useful for other driver where this type of allocation applies.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6436
Replacing the bubble sort with insertion sort gives an 80% reduction
in runtime on average, with randomized keys, for small partitions.
If the keys are pre-sorted, insertion sort runs in linear time, and
even if the keys are reversed, insertion sort is faster than bubble
sort, although not by much.
Update comment describing "tcp_lro_sort()" while at it.
Differential Revision: https://reviews.freebsd.org/D6619
Sponsored by: Mellanox Technologies
Tested by: Netflix
Suggested by: Pieter de Goeje <pieter@degoeje.nl>
Reviewed by: ed, gallatin, gnn, transport
That check wasn't enough to handle appending a two byte character
following it.
This prevented my T400 (Intel Core 2 Duo P8400) from attaching;
it would panic from a stack overflow detection.
Only HMAC-SHA256 is added as it is the only SHA-2 variant supported by
cryptodev. It is not possible to register hardware support for other
algorithms in the family including regular non-keyed SHA256.
Submitted by: Michal Stanek <mst@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D6219
The output of HMAC was previously truncated to 12 bytes. This was only
correct in case of one particular crypto client - the new version of IPSEC.
Fix by taking into account the cri_mlen field in cryptoini session request
filled in by the client.
Submitted by: Michal Stanek <mst@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D6218
TDMA and CESA registers are placed in different ranges of memory. Split
memory resource in DTS to reflect that. This change is needed to support
multiple CESA nodes as otherwise the ranges of different nodes would
overlap.
In consequence, CESA_WRITE and CESA_READ macros have been split depending
on which range of registers is accessed. Offsets for CESA registers have
been modified as the base address has changed.
Submitted by: Michal Stanek <mst@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D6217
Check if there is a second CESA SRAM node in FDT and add a CPU window
for it. Define A38X specific macro for setting device attribute for
each node.
Submitted by: Michal Stanek <mst@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D6216
On other platforms with CESA accelerator the SRAM memory is mapped in
early init before driver is attached. This method only works correctly
with mappings no smaller than L1 section size (1MB). There may be more
SRAM blocks and they may have smaller sizes than 1MB as is the case
for Armada38x. Instead, map SRAM memory with bus_space_map() in CESA
driver attach. Note that we can no longer assume that VA == PA for the
SRAM.
Submitted by: Michal Stanek <mst@semihalf.com
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D6215
Commit was temporary fix due to rman_res_t defined as 32-bit u_long.
After redefining it as 64-bit variable workaround is not needed and
was removed.
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D6214
but removed due to other changes in the system. Restore the llentry pointer
to the "struct route", and use it to cache the L2 lookup (ARP or ND6) as
appropriate.
Submitted by: Mike Karels
Differential Revision: https://reviews.freebsd.org/D6262
Otherwise we transmit the first neighbour solicitation in the context of the
caller of nd6_dad_start(), which can easily result in lock recursion. When
DAD is to be started after some delay, we send the first NS from the DAD
callout handler, so just change the implementation to do this in the
non-delayed case as well.
Reviewed by: ae, hrs
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6639
Per the KASSERT at the beginning of the function, we expect that the page
does not belong to any object, so its object and pindex fields are
meaningless. Reset them in the rare case that vm_radix_insert() fails.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6669
The PV backend will only pick the new options when the interface is detached
and reattached again, so perform a full reset when changing options. This is
very fast, and should not be noticeable by the user.
Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D6658
Just calling gnttab_end_foreign_access_ref doesn't free the references,
instead call gnttab_end_foreign_access with a NULL page argument in order to
have the grant references freed. The code that maps the ring
(xenbus_map_ring) already uses gnttab_grant_foreign_access which takes care
of allocating a grant reference.
Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D6608
This patch fixes two issues seen on hot-unplug. The first one is a panic
caused by calling ether_ifdetach after freeing the internal netfront queue
structures. ether_ifdetach will call xn_qflush, and this needs to be done
before freeing the queues. This prevents the following panic:
Fatal trap 9: general protection fault while in kernel mode
cpuid = 2; apic id = 04
instruction pointer = 0x20:0xffffffff80b1687f
stack pointer = 0x28:0xfffffe009239e770
frame pointer = 0x28:0xfffffe009239e780
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (thread taskq)
[ thread pid 0 tid 100015 ]
Stopped at strlen+0x1f: movq (%rcx),%rax
db> bt
Tracing pid 0 tid 100015 td 0xfffff800038a6000
strlen() at strlen+0x1f/frame 0xfffffe009239e780
kvprintf() at kvprintf+0xfa0/frame 0xfffffe009239e890
vsnprintf() at vsnprintf+0x31/frame 0xfffffe009239e8b0
kassert_panic() at kassert_panic+0x5a/frame 0xfffffe009239e920
__mtx_lock_flags() at __mtx_lock_flags+0x164/frame 0xfffffe009239e970
xn_qflush() at xn_qflush+0x59/frame 0xfffffe009239e9b0
if_detach() at if_detach+0x17e/frame 0xfffffe009239ea10
netif_free() at netif_free+0x97/frame 0xfffffe009239ea30
netfront_detach() at netfront_detach+0x11/frame 0xfffffe009239ea40
[...]
Another panic can be triggered by hot-plugging a NIC:
Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0xffffffff80902203
stack pointer = 0x28:0xfffffe00508d3660
frame pointer = 0x28:0xfffffe00508d36a0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 2960 (ifconfig)
[ thread pid 2960 tid 100088 ]
Stopped at xn_txq_mq_start+0x33: divl %esi,%eax
db> bt
Tracing pid 2960 tid 100088 td 0xfffff8000850aa00
xn_txq_mq_start() at xn_txq_mq_start+0x33/frame 0xfffffe00508d36a0
ether_output() at ether_output+0x570/frame 0xfffffe00508d3720
arprequest() at arprequest+0x433/frame 0xfffffe00508d3820
arp_ifinit() at arp_ifinit+0x49/frame 0xfffffe00508d3850
xn_ioctl() at xn_ioctl+0x1a2/frame 0xfffffe00508d3890
in_control() at in_control+0x882/frame 0xfffffe00508d3910
ifioctl() at ifioctl+0xda1/frame 0xfffffe00508d39a0
kern_ioctl() at kern_ioctl+0x246/frame 0xfffffe00508d3a00
sys_ioctl() at sys_ioctl+0x171/frame 0xfffffe00508d3ae0
amd64_syscall() at amd64_syscall+0x2db/frame 0xfffffe00508d3bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe00508d3bf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8011e185a, rsp =
0x7fffffffe478, rbp = 0x7fffffffe4c0 ---
This is caused by marking the driver as active before it's fully
initialized, and thus calling xn_txq_mq_start with num_queues set to 0.
Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D6646
In order to use custom taskqueues we would have to mask the interrupt, which
is basically what is already done for an interrupt handler, or else we risk
loosing interrupts. This switches netfront to the same interrupt handling
that was done before multiqueue support was added.
Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
This is based on Linux commit 1f3c2eba1e2d866ef99bb9b10ade4096e3d7607c from
David Vrabel:
A full Rx ring only requires 1 MiB of memory. This is not enough memory
that it is useful to dynamically scale the number of Rx requests in the ring
based on traffic rates, because:
a) Even the full 1 MiB is a tiny fraction of a typically modern Linux
VM (for example, the AWS micro instance still has 1 GiB of memory).
b) Netfront would have used up to 1 MiB already even with moderate
data rates (there was no adjustment of target based on memory
pressure).
c) Small VMs are going to typically have one VCPU and hence only one
queue.
Keeping the ring full of Rx requests handles bursty traffic better than
trying to converge on an optimal number of requests to keep filled.
Reviewed by: Wei Liu <wei.liu2@citrix.com>
Sponsored by: Citrix Systems R&D
Currently FreeBSD is not properly fetching the TSO information from the Xen
PV ring, and thus the received packets didn't have all the necessary
information, like the segment size or even the TSO flag set.
Sponsored by: Citrix Systems R&D
* The "if (!data->valid_tx_ant || !data->valid_rx_ant) {" check was getting
triggered with a 3165 chipset.
Submitted by: Imre Vadasz <imre@vdsz.com>
Obtained from: DragonflyBSD 3655dfb6fc311fc83e5ce8370dd91b4cd4a37991