taskqueue_enqueue() was changed to support both fast and non-fast
taskqueues 10 years ago in r154167. It has been a compat shim ever
since. It's time for the compat shim to go.
Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: sephe
Differential Revision: https://reviews.freebsd.org/D5131
xn_ifp is allocated in create_netdev with if_alloc(IFT_ETHER).
According to the current arrangement it can't be NULL.
Coverity ID: 1349805
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D5252
The variable error is assigned to 0 before entering the switch.
Assigning error to 0 before break pointless rewrites the real error
value that should be returned.
Coverity ID: 1304974
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D5250
Add support for multiple TX and RX queue pairs. The default number of queues
is set to 4, but can be easily changed from the sysctl node hw.xn.num_queues.
Also heavily refactor netfront driver: break out a bunch of helper
functions and different structures. Use threads to handle TX and RX.
Remove some dead code and fix quite a few bugs as I go along.
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D4193
r289587 broke LINT-NOIP kernels because the lro and queued local variables
are defined but not used. Add preprocessor guards around them.
Reported by: emaste
Sponsored by: Citrix Systems R&D
xen/hypervisor.h:
- Remove unused helpers: MULTI_update_va_mapping, is_initial_xendomain,
is_running_on_xen
- Remove unused define CONFIG_X86_PAE
- Remove unused variable xen_start_info: note that it's used inpcifront
which is not built at all
- Remove forward declaration of HYPERVISOR_crash
xen/xen-os.h:
- Remove unused define CONFIG_X86_PAE
- Drop unused helpers: test_and_clear_bit, clear_bit,
force_evtchn_callback
- Implement a generic version (based on ofed/include/linux/bitops.h) of
set_bit and test_bit and prefix them by xen_ to avoid any use by other
code than Xen. Note that It would be worth to investigate a generic
implementation in FreeBSD.
- Replace barrier() by __compiler_membar()
- Replace cpu_relax() by cpu_spinwait(): it's exactly the same as rep;nop
= pause
xen/xen_intr.h:
- Move the prototype of xen_intr_handle_upcall in it: Use by all the
platform
x86/xen/xen_intr.c:
- Use BITSET* for the enabledbits: Avoid to use custom helpers
- test_bit/set_bit has been renamed to xen_test_bit/xen_set_bit
- Don't export the variable xen_intr_pcpu
dev/xen/blkback/blkback.c:
- Fix the string format when XBB_DEBUG is enabled: host_addr is typed
uint64_t
dev/xen/balloon/balloon.c:
- Remove set but not used variable
- Use the correct type for frame_list: xen_pfn_t represents the frame
number on any architecture
dev/xen/control/control.c:
- Return BUS_PROBE_WILDCARD in xs_probe: Returning 0 in a probe callback
means the driver can handle this device. If by any chance xenstore is the
first driver, every new device with the driver is unset will use
xenstore.
dev/xen/grant-table/grant_table.c:
- Remove unused cmpxchg
- Drop unused include opt_pmap.h: Doesn't exist on ARM64 and it doesn't
contain anything required for the code on x86
dev/xen/netfront/netfront.c:
- Use the correct type for rx_pfn_array: xen_pfn_t represents the frame
number on any architecture
dev/xen/netback/netback.c:
- Use the correct type for gmfn: xen_pfn_t represents the frame number on
any architecture
dev/xen/xenstore/xenstore.c:
- Return BUS_PROBE_WILDCARD in xctrl_probe: Returning 0 in a probe callback
means the driver can handle this device. If by any chance xenstore is the
first driver, every new device with the driver is unset will use xenstore.
Note that with the changes, x86/include/xen/xen-os.h doesn't contain anymore
arch-specific code. Although, a new series will add some helpers that differ
between x86 and ARM64, so I've kept the headers for now.
Submitted by: Julien Grall <julien.grall@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3921
Sponsored by: Citrix Systems R&D
The failure path for allocating rx grant refs should not try to free tx
grant refs because tx grant refs were allocated after that. Also fix the
error path for xen_net_read_mac.
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3891
Sponsored by: Citrix Systems R&D
This is redundant because ether_ifattach will set that field.
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3918
Sponsored by: Citrix Systems R&D
We're way beyond FreeBSD 7 at this point.
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3892
Sponsored by: Citrix Systems R&D
Multiqueue feature will make the number of queues dynamic, so XN_LOCK_INIT
won't be that useful. Remove the macro and call mtx_init directly.
XN_LOCK_DESTROY is just dead code.
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3890
Sponsored by: Citrix Systems R&D
Rename it with netfront_ prefix and purge a bunch of unused fields.
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3889
Sponsored by: Citrix Systems R&D
Currently neither Linux nor FreeBSD netback supports page flipping. NetBSD
still supports that. It is not sure how many people actually use page
flipping, but page flipping is supposed to be slower than copying nowadays.
It will also shatter frontend / backend address space.
Overall this feature is more of a burden than a benefit.
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3888
Sponsored by: Citrix Systems R&D
Introduce two new loader tunnables that can be used to disable PV disks and
PV nics at boot time. They default to 0 and should be set to 1 (or any
number different than 0) in order to disable the PV devices:
hw.xen.disable_pv_disks=1
hw.xen.disable_pv_nics=1
In /boot/loader.conf will disable both PV disks and nics.
Sponsored by: Citrix Systems R&D
Tested by: Karl Pielorz <kpielorz_lst@tdx.co.uk>
MFC after: 1 week
use vtophys() directly instead of vtomach() and retire the no-longer-used
headers <machine/xenfunc.h> and <machine/xenvar.h>.
Reported by: bde (stale bits in <machine/xenfunc.h>)
Reviewed by: royger (earlier version)
Differential Revision: https://reviews.freebsd.org/D3266
Try to preserve the xn configuration when migrating. This is not always
possible since the backend might not have the same set of options
available, in which case we will try to preserve as many as possible.
MFC after: 2 weeks
PR: 183139
Reported by: mcdouga9@egr.msu.edu
Sponsored by: Citrix Systems R&D
years for head. However, it is continuously misused as the mpsafe argument
for callout_init(9). Deprecate the flag and clean up callout_init() calls
to make them more consistent.
Differential Revision: https://reviews.freebsd.org/D2613
Reviewed by: jhb
MFC after: 2 weeks
Netfront has to wait for the backend to switch to state XenbusStateConnected
before sending the ARP request, or else the backend might not be connected
and thus the packet will be lost.
Sponsored by: Citrix Systems R&D
MFC after: 1 week
remains. Xen is planning to phase out support for PV upstream since it
is harder to maintain and has more overhead. Modern x86 CPUs include
virtualization extensions that support HVM guests instead of PV guests.
In addition, the PV code was i386 only and not as well maintained recently
as the HVM code.
- Remove the i386-only NATIVE option that was used to disable certain
components for PV kernels. These components are now standard as they
are on amd64.
- Remove !XENHVM bits from PV drivers.
- Remove various shims required for XEN (e.g. PT_UPDATES_FLUSH, LOAD_CR3,
etc.)
- Remove duplicate copy of <xen/features.h>.
- Remove unused, i386-only xenstored.h.
Differential Revision: https://reviews.freebsd.org/D2362
Reviewed by: royger
Tested by: royger (i386/amd64 HVM domU and amd64 PVH dom0)
Relnotes: yes
Before this change, the current code handles SIOCGIFADDR the same
way with SIOCSIFADDR, which involves full arp_ifinit, et al. They
should be unnecessary for SIOCGIFADDR case.
Differential Revision: https://reviews.freebsd.org/D1508
Reviewed by: glebius
MFC after: 2 weeks
socket-buffer implementations, introduce a return value for MCLGET()
(and m_cljget() that underlies it) to allow the caller to avoid testing
M_EXT itself. Update all callers to use the return value.
With this change, very few network device drivers remain aware of
M_EXT; the primary exceptions lie in mbuf-chain pretty printers for
debugging, and in a few cases, custom mbuf and cluster allocation
implementations.
NB: This is a difficult-to-test change as it touches many drivers for
which I don't have physical devices. Instead we've gone for intensive
review, but further post-commit review would definitely be appreciated
to spot errors where changes could not easily be made mechanically,
but were largely mechanical in nature.
Differential Revision: https://reviews.freebsd.org/D1440
Reviewed by: adrian, bz, gnn
Sponsored by: EMC / Isilon Storage Division
- Wrong integer type was specified.
- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.
- Logical OR where binary OR was expected.
- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.
- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.
- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.
- Updated "EXAMPLES" section in SYSCTL manual page.
MFC after: 3 days
Sponsored by: Mellanox Technologies
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.
Reviewed by: adrian, rmacklem
Sponsored by: Mellanox Technologies
MFC after: 1 week
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Re-structure Xen HVM support so that:
- Xen is detected and hypercalls can be performed very
early in system startup.
- Xen interrupt services are implemented using FreeBSD's native
interrupt delivery infrastructure.
- the Xen interrupt service implementation is shared between PV
and HVM guests.
- Xen interrupt handlers can optionally use a filter handler
in order to avoid the overhead of dispatch to an interrupt
thread.
- interrupt load can be distributed among all available CPUs.
- the overhead of accessing the emulated local and I/O apics
on HVM is removed for event channel port events.
- a similar optimization can eventually, and fairly easily,
be used to optimize MSI.
Early Xen detection, HVM refactoring, PVHVM interrupt infrastructure,
and misc Xen cleanups:
Sponsored by: Spectra Logic Corporation
Unification of PV & HVM interrupt infrastructure, bug fixes,
and misc Xen cleanups:
Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
sys/x86/x86/local_apic.c:
sys/amd64/include/apicvar.h:
sys/i386/include/apicvar.h:
sys/amd64/amd64/apic_vector.S:
sys/i386/i386/apic_vector.s:
sys/amd64/amd64/machdep.c:
sys/i386/i386/machdep.c:
sys/i386/xen/exception.s:
sys/x86/include/segments.h:
Reserve IDT vector 0x93 for the Xen event channel upcall
interrupt handler. On Hypervisors that support the direct
vector callback feature, we can request that this vector be
called directly by an injected HVM interrupt event, instead
of a simulated PCI interrupt on the Xen platform PCI device.
This avoids all of the overhead of dealing with the emulated
I/O APIC and local APIC. It also means that the Hypervisor
can inject these events on any CPU, allowing upcalls for
different ports to be handled in parallel.
sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
Map Xen per-vcpu area during AP startup.
sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
Increase the FreeBSD IRQ vector table to include space
for event channel interrupt sources.
sys/amd64/include/pcpu.h:
sys/i386/include/pcpu.h:
Remove Xen HVM per-cpu variable data. These fields are now
allocated via the dynamic per-cpu scheme. See xen_intr.c
for details.
sys/amd64/include/xen/hypercall.h:
sys/dev/xen/blkback/blkback.c:
sys/i386/include/xen/xenvar.h:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/xen/gnttab.c:
Prefer FreeBSD primatives to Linux ones in Xen support code.
sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
sys/dev/xen/balloon/balloon.c:
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/console/xencons_ring.c:
sys/dev/xen/control/control.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/xenpci/xenpci.c:
sys/i386/i386/machdep.c:
sys/i386/include/pmap.h:
sys/i386/include/xen/xenfunc.h:
sys/i386/isa/npx.c:
sys/i386/xen/clock.c:
sys/i386/xen/mp_machdep.c:
sys/i386/xen/mptable.c:
sys/i386/xen/xen_clock_util.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/xen_rtc.c:
sys/xen/evtchn/evtchn_dev.c:
sys/xen/features.c:
sys/xen/gnttab.c:
sys/xen/gnttab.h:
sys/xen/hvm.h:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_if.m:
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenstore/xenstore.c:
sys/xen/xenstore/xenstore_dev.c:
sys/xen/xenstore/xenstorevar.h:
Pull common Xen OS support functions/settings into xen/xen-os.h.
sys/amd64/include/xen/xen-os.h:
sys/i386/include/xen/xen-os.h:
sys/xen/xen-os.h:
Remove constants, macros, and functions unused in FreeBSD's Xen
support.
sys/xen/xen-os.h:
sys/i386/xen/xen_machdep.c:
sys/x86/xen/hvm.c:
Introduce new functions xen_domain(), xen_pv_domain(), and
xen_hvm_domain(). These are used in favor of #ifdefs so that
FreeBSD can dynamically detect and adapt to the presence of
a hypervisor. The goal is to have an HVM optimized GENERIC,
but more is necessary before this is possible.
sys/amd64/amd64/machdep.c:
sys/dev/xen/xenpci/xenpcivar.h:
sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/sys/kernel.h:
Refactor magic ioport, Hypercall table and Hypervisor shared
information page setup, and move it to a dedicated HVM support
module.
HVM mode initialization is now triggered during the
SI_SUB_HYPERVISOR phase of system startup. This currently
occurs just after the kernel VM is fully setup which is
just enough infrastructure to allow the hypercall table
and shared info page to be properly mapped.
sys/xen/hvm.h:
sys/x86/xen/hvm.c:
Add definitions and a method for configuring Hypervisor event
delievery via a direct vector callback.
sys/amd64/include/xen/xen-os.h:
sys/x86/xen/hvm.c:
sys/conf/files:
sys/conf/files.amd64:
sys/conf/files.i386:
Adjust kernel build to reflect the refactoring of early
Xen startup code and Xen interrupt services.
sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
sys/dev/xen/control/control.c:
sys/dev/xen/evtchn/evtchn_dev.c:
sys/dev/xen/netback/netback.c:
sys/dev/xen/netfront/netfront.c:
sys/xen/xenstore/xenstore.c:
sys/xen/evtchn/evtchn_dev.c:
sys/dev/xen/console/console.c:
sys/dev/xen/console/xencons_ring.c
Adjust drivers to use new xen_intr_*() API.
sys/dev/xen/blkback/blkback.c:
Since blkback defers all event handling to a taskqueue,
convert this task queue to a "fast" taskqueue, and schedule
it via an interrupt filter. This avoids an unnecessary
ithread context switch.
sys/xen/xenstore/xenstore.c:
The xenstore driver is MPSAFE. Indicate as much when
registering its interrupt handler.
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbusvar.h:
Remove unused event channel APIs.
sys/xen/evtchn.h:
Remove all kernel Xen interrupt service API definitions
from this file. It is now only used for structure and
ioctl definitions related to the event channel userland
device driver.
Update the definitions in this file to match those from
NetBSD. Implementing this interface will be necessary for
Dom0 support.
sys/xen/evtchn/evtchnvar.h:
Add a header file for implemenation internal APIs related
to managing event channels event delivery. This is used
to allow, for example, the event channel userland device
driver to access low-level routines that typical kernel
consumers of event channel services should never access.
sys/xen/interface/event_channel.h:
sys/xen/xen_intr.h:
Standardize on the evtchn_port_t type for referring to
an event channel port id. In order to prevent low-level
event channel APIs from leaking to kernel consumers who
should not have access to this data, the type is defined
twice: Once in the Xen provided event_channel.h, and again
in xen/xen_intr.h. The double declaration is protected by
__XEN_EVTCHN_PORT_DEFINED__ to ensure it is never declared
twice within a given compilation unit.
sys/xen/xen_intr.h:
sys/xen/evtchn/evtchn.c:
sys/x86/xen/xen_intr.c:
sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/xenpci/xenpcivar.h:
New implementation of Xen interrupt services. This is
similar in many respects to the i386 PV implementation with
the exception that events for bound to event channel ports
(i.e. not IPI, virtual IRQ, or physical IRQ) are further
optimized to avoid mask/unmask operations that aren't
necessary for these edge triggered events.
Stubs exist for supporting physical IRQ binding, but will
need additional work before this implementation can be
fully shared between PV and HVM.
sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
sys/i386/xen/mp_machdep.c
sys/x86/xen/hvm.c:
Add support for placing vcpu_info into an arbritary memory
page instead of using HYPERVISOR_shared_info->vcpu_info.
This allows the creation of domains with more than 32 vcpus.
sys/i386/i386/machdep.c:
sys/i386/xen/clock.c:
sys/i386/xen/xen_machdep.c:
sys/i386/xen/exception.s:
Add support for new event channle implementation.
In netif_free(), call ifmedia_removeall() after ether_ifdetach()
so that bpf listeners are detached, any link state processing
is completed, and there is no chance for external reference to media
information.
Suggested by: yongari
MFC after: 1 week
Xen host side can handle after defragmentation.
This prevents the driver from throwing away too long TSO chains and
improves the performance on Amazon AWS instances with 10GigE virtual
interfaces to the normally expected throughput.
Submitted by: cperciva (earlier version)
Reviewed by: cperciva
Tested by: cperciva
MFC after: 1 week
dev/xen/netfront:
In netif_free(), properly stop the interface and drain any pending
timers prior to disconnecting from the backend device.
Remove all media and detach our interface object from the system
prior to deleting it.
PR: kern/176471
Submitted by: Roger Pau Monne <roger.pau@citrix.com>
Reviewed by: gibbs
MFC after: 1 week
__func__ and add some missing ones.
- Remove a stale comment.
- Remove unused NUM_ELEMENTS macro.
- Remove extra empty lines.
- Use DEVMETHOD_END.
- Use NULL rather than 0 for pointers.
MFC after: 3 days
back-end features.
sys/dev/xen/netfront/netfront.c:
o Add xn_query_features() which reads the XenStore and
records the TSO, LRO, and chained ring-request support
of the backend.
o Rename xn_configure_lro() to xn_configure_features() and
use this routine to manage the setup of TSO, LRO, and
checksum offload.
o In create_netdev(), initialize if_capabilities and
if_hwassist to the capabilities found on all backends.
Delegate configuration of if_capenable and the TSO flag
if if_hwassist to xn_configure_features().
Reported by: Hugo Silva (fix inspired by patch provided)
Approved by: re
MFC after: 1 week
PV devices with the ioemu attribute set.
sys/dev/xen/netfront/netfront.c:
o If a mac address for the interface cannot be found
in the front-side XenStore tree, look for an entry
in the back-side tree. With ioemu devices, the
emulator does not populate the front side tree and
neither does Xend.
o Return an error rather than panic when an attach
attempt fails.
Reported by: Janne Snabb (fix inspired by patch provided)
PR: kern/154302
Approved by: re