Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify
struct dirent layout to add d_off, increase the size of d_fileno
to 64 bits, increase the size of d_namlen to 16 bits, and change
the required alignment. Increase struct statfs f_mntfromname[] and
f_mntonname[] array length MNAMELEN to 1024.
ABI breakage is mitigated by providing compatibility using versioned
symbols, ingenious use of the existing padding in structures, and
by employing other tricks. Unfortunately, not everything can be
fixed, especially outside the base system. For instance, third-party
APIs which pass struct stat around are broken in backward and
forward incompatible ways.
The kinfo sysctl MIBs ABI is changed in a backward-compatible way, but
there is no general mechanism to handle other sysctl MIBs which
return structures where the layout has changed. It was considered
that the breakage is either in the management interfaces, where we
usually allow ABI slip, or is not important.
Struct xvnode changed layout; no compat shims are provided.
For struct xtty, dev_t tty device member was reduced to uint32_t.
It was decided that keeping ABI compat in this case is more useful
than reporting 64-bit dev_t, for the sake of pstat.
Update note: strictly follow the instructions in UPDATING. Build
and install the new kernel with COMPAT_FREEBSD11 option enabled,
then reboot, and only then install new world.
Credits: The 64-bit inode project, also known as ino64, started life
many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick
(mckusick) then picked up and updated the patch, and acted as a
flag-waver. Feedback, suggestions, and discussions were carried
by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles),
and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial
ports investigation followed by an exp-run by Antoine Brodin (antoine).
Essential and all-embracing testing was done by Peter Holm (pho).
The heavy lifting of coordinating all these efforts and bringing the
project to completion was done by Konstantin Belousov (kib).
Sponsored by: The FreeBSD Foundation (emaste, kib)
Differential revision: https://reviews.freebsd.org/D10439
ENA is a networking interface designed to make good use of modern CPU
features and system architectures.
The ENA device exposes a lightweight management interface with a
minimal set of memory mapped registers and extendable command set
through an Admin Queue.
The driver supports a range of ENA devices, is link-speed independent
(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has
a negotiated and extendable feature set.
Some ENA devices support SR-IOV. This driver is used for both the
SR-IOV Physical Function (PF) and Virtual Function (VF) devices.
ENA devices enable high speed and low overhead network traffic
processing by providing multiple Tx/Rx queue pairs (the maximum number
is advertised by the device via the Admin Queue), a dedicated MSI-X
interrupt vector per Tx/Rx queue pair, and CPU cacheline optimized
data placement.
The ENA driver supports industry standard TCP/IP offload features such
as checksum offload and TCP transmit segmentation offload (TSO).
Receive-side scaling (RSS) is supported for multi-core scaling.
The ENA driver and its corresponding devices implement health
monitoring mechanisms such as watchdog, enabling the device and driver
to recover in a manner transparent to the application, as well as
debug logs.
Some of the ENA devices support a working mode called Low-latency
Queue (LLQ), which saves several more microseconds. This feature will
be implemented for the driver in future releases.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Jakub Palider <jpa@semihalf.com>
Jan Medala <jan@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
Differential revision: https://reviews.freebsd.org/D10427
Since netfront uses different locks for the RX and TX paths there's no need to
drop the RX lock before calling if_input.
Suggested by: jhb
Tested by: cperciva
Sponsored by: Citrix Systems R&D
MFC with: r318523
When creating EQs to handle CQ completion events for the PF or for
VFs, we create enough EQE entries to handle completions for the max
number of CQs that can use that EQ.
When SRIOV is activated, the max number of CQs a VF (or the PF) can
obtain is its CQ quota (determined by the Hypervisor resource
tracker). Therefore, when creating an EQ, the number of EQE entries
that the VF should request for that EQ is the CQ quota value (and not
the total number of CQs available in the firmware).
Under SRIOV, the PF also must use its CQ quota, because the resource
tracker also controls how many CQs the PF can obtain.
Using the firmware total CQs instead of the CQ quota when creating EQs
resulted in wasted MTT entries, due to allocating more EQEs than were
needed.
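
The sizing rule described above can be sketched in a few lines. This is
only an illustration: struct cq_caps, its field names, and eq_cq_entries()
are hypothetical and are not taken from the driver, and falling back to
the firmware total when SRIOV is inactive is an assumption read from the
first paragraph.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability summary; names are illustrative only. */
struct cq_caps {
	uint32_t fw_total_cqs;	/* total CQs available in the firmware */
	uint32_t cq_quota;	/* CQs granted by the resource tracker */
	bool	 sriov_active;	/* true when SRIOV is activated */
};

/*
 * Number of EQEs to request for an EQ that handles CQ completions.
 * Under SRIOV both the PF and the VFs are limited by their quota, so
 * sizing the EQ for the firmware total would over-allocate EQEs.
 */
static uint32_t
eq_cq_entries(const struct cq_caps *caps)
{
	assert(caps != NULL);
	if (caps->sriov_active)
		return (caps->cq_quota);
	/* Assumption: without SRIOV the firmware total is the limit. */
	return (caps->fw_total_cqs);
}
```

With a quota of 1024 and a firmware total of 65536, the sketch requests
1024 entries under SRIOV instead of 65536.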
MFC after: 3 days
Sponsored by: Mellanox Technologies
e6000sw family automatically reflects PHY status in each port's registers.
Therefore it is not necessary to do a full PHY polling sequence, which
results in much quicker operation and much less significant usage of
the SMI bus.
Care must be taken that the resulting ifmedia_active is identical to
what the PHY will compute, or gratuitous link status changes will
occur whenever the PHY's update function is called.
This patch implements the above improvement. While here, make the
pointer to the proc structure part of the software context instead of
a global variable.
Submitted by: Marcin Wojtas <mw@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Reviewed by: loos
Differential revision: https://reviews.freebsd.org/D10714
Make sure the RX ring lock is only released when the state of the ring is
consistent, or else concurrent calls to xn_rxeof might get an inconsistent ring
state and thus some packets might be processed twice.
Note that this is not very common, and could only happen when an interrupt is
delivered while in xn_ifinit.
Reported by: cperciva
Tested by: cperciva
MFC after: 1 week
Sponsored by: Citrix Systems R&D
For all Marvell devices, MBUS windows configuration is done
in a common place. Only CESA was an exception, so move its
related code from the driver to mv_common.c. This way it uses
the same, proper DRAM information as all other interfaces
instead of parsing the DT /memory node directly.
Submitted by: Marcin Wojtas <mw@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Reviewed by: loos
Differential revision: https://reviews.freebsd.org/D10723
The previous implementation of PHY polling risked an endless loop
and very high occupancy of the SMI bus. Improve the operation by
limiting the number of polling attempts and adding a sleepable
pause.
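
A minimal userspace sketch of the bounded-polling idea follows. The retry
budget POLL_MAX_TRIES, the callback types, and fake_busy() are all
assumptions made for illustration; the driver's actual limit, register
access, and pause primitive are not shown in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define POLL_MAX_TRIES	100	/* assumed retry budget, not the driver's */

typedef bool (*busy_fn)(void *);	/* true while the bus is busy */
typedef void (*pause_fn)(void *);	/* sleep between attempts */

/*
 * Poll until is_busy() reports idle, pausing between attempts, and
 * give up with ETIMEDOUT once the retry budget is exhausted, so the
 * loop can never spin forever.
 */
static int
poll_until_idle(busy_fn is_busy, pause_fn do_pause, void *arg)
{
	assert(is_busy != NULL);
	for (int i = 0; i < POLL_MAX_TRIES; i++) {
		if (!is_busy(arg))
			return (0);
		if (do_pause != NULL)
			do_pause(arg);
	}
	return (ETIMEDOUT);
}

/* Stand-in for a register read: busy for *arg more calls, then idle. */
static bool
fake_busy(void *arg)
{
	int *remaining = arg;

	return ((*remaining)-- > 0);
}
```

In a kernel driver the pause callback would be a sleepable wait rather
than a busy delay, which is what keeps the bus from being saturated.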
Submitted by: Marcin Wojtas <mw@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Reviewed by: loos
Differential revision: https://reviews.freebsd.org/D10713
Call disk_gone when the backend switches to the "Closing" state and blkfront
still has pending users. This allows the disk to be detached, and will call
into xbd_closing by itself when the geom layout cleanup has finished.
Reported by: bapt
Tested by: manu
Reviewed by: bapt
Sponsored by: Citrix Systems R&D
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D10772
kern_yield(0) effectively causes the calling thread to be rescheduled
immediately since it resets the thread's priority to the highest possible
value. This can cause livelocks when the pattern
"while (!trylock()) kern_yield(0);" is used since the thread holding the
lock may linger on the runqueue for the CPU on which the looping thread is
running.
MFC after: 1 week
The ccr(4) driver supports use of the crypto accelerator engine on
Chelsio T6 NICs in "lookaside" mode via the opencrypto framework.
Currently, the driver supports AES-CBC, AES-CTR, AES-GCM, and AES-XTS
cipher algorithms as well as the SHA1-HMAC, SHA2-256-HMAC, SHA2-384-HMAC,
and SHA2-512-HMAC authentication algorithms. The driver also supports
chaining one of AES-CBC, AES-CTR, or AES-XTS with an authentication
algorithm for encrypt-then-authenticate operations.
Note that this driver is still under active development and testing and
may not yet be ready for production use. It does pass the tests in
tests/sys/opencrypto with the exception that the AES-GCM implementation
in the driver does not yet support requests with a zero byte payload.
To use this driver currently, the "uwire" configuration must be used
along with explicitly enabling support for lookaside crypto capabilities
in the cxgbe(4) driver. These can be done by setting the following
tunables before loading the cxgbe(4) driver:
hw.cxgbe.config_file=uwire
hw.cxgbe.cryptocaps_allowed=-1
MFC after: 1 month
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D10763
This includes NVMe device support and adds support for the following adapters:
SAS 3408
SAS 3416
SAS 3508
SAS 3516
SAS 3616
SAS 3708
SAS 3716
Reviewed by: ken, scottl, asomers, mav
Approved by: ken, scottl, mav
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D10095
malloc(9) should always return something when the M_WAITOK flag is used,
but keep this code and change the flag to M_NOWAIT as it is under a lock
(allows for possible future change). Free the ifnet structure to avoid
a memory leak on failure.
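
The error path described above has roughly the following shape. This is
a userspace sketch, not the driver code: struct port_ctx, port_init(),
and failing_alloc() are hypothetical, and plain malloc()/free() stand in
for the kernel's ifnet allocation and M_NOWAIT malloc(9).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-port state; names are illustrative only. */
struct port_ctx {
	void *ifp;	/* stand-in for the ifnet structure */
	void *name;	/* stand-in for the non-sleeping allocation */
};

typedef void *(*alloc_fn)(size_t);

/* Allocator that always fails, used to exercise the error path. */
static void *
failing_alloc(size_t n)
{
	(void)n;
	return (NULL);
}

static int
port_init(struct port_ctx *pc, alloc_fn alloc_nowait)
{
	assert(pc != NULL && alloc_nowait != NULL);
	pc->name = NULL;
	pc->ifp = malloc(64);
	if (pc->ifp == NULL)
		return (ENOMEM);
	/* A non-sleeping allocation may fail, so the check is required. */
	pc->name = alloc_nowait(16);
	if (pc->name == NULL) {
		/* Release the earlier allocation to avoid leaking it. */
		free(pc->ifp);
		pc->ifp = NULL;
		return (ENOMEM);
	}
	return (0);
}
```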
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Reviewed by: loos
Differential revision: https://reviews.freebsd.org/D10711
Experimentally we know this value works, but the hardware
may support an even higher value.
PR: 213876
Reported by: J.Catrysse@proximedia.be
MFC after: 1 week
A long long time ago the register keyword told the compiler to store
the corresponding variable in a CPU register, but it is not relevant
for any compiler used in the FreeBSD world today.
ANSIfy related prototypes while here.
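
For illustration, the kind of conversion meant here looks as follows.
The function sum() is a made-up example and does not come from the
changed files; the old-style form is kept in a comment only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Before (K&R-style definition with register hints):
 *
 *	int
 *	sum(buf, len)
 *		register const int *buf;
 *		register int len;
 *	{
 *		...
 *	}
 *
 * After: an ANSI prototype-style definition without the register
 * keyword. The compiler decides register allocation on its own.
 */
static int
sum(const int *buf, int len)
{
	int total = 0;

	assert(buf != NULL || len == 0);
	for (int i = 0; i < len; i++)
		total += buf[i];
	return (total);
}
```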
Reviewed by: cem, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D10193
Invoke any identify routines of child drivers during attach before attaching
children, and delete any remaining devices after deleting ports.
MFC after: 1 month
Sponsored by: Chelsio Communications
2. Use sysctls for TRACE_LRO_CNT and TRACE_TSO_PKT_LEN
3. remove unused mtx tx_lock
4. bind taskqueue kernel thread to the appropriate cpu core
5. when tx_ring is full, stop further transmits till at least 1/16th of the Tx Ring is empty. In our case 1K entries. Also if there are rx_pkts to process, put the taskqueue thread to sleep for 100ms, before enabling interrupts.
6. Use rx_pkt_threshold of 128.
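
The stop/resume rule in item 5 can be sketched as a small hysteresis
check. The ring size of 16K is inferred from "1/16th ... 1K entries",
and struct tx_ring and tx_can_send() are hypothetical names, not the
driver's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TX_RING_SIZE		(16 * 1024)	/* assumed from item 5 */
#define TX_RESUME_THRESH	(TX_RING_SIZE / 16)	/* 1K entries */

/* Hypothetical ring state; names are illustrative only. */
struct tx_ring {
	uint32_t free;		/* currently free descriptors */
	bool	 stopped;	/* transmits are held off */
};

/*
 * Once the ring has filled up, refuse further transmits until at
 * least 1/16th of the ring is free again, instead of resuming as
 * soon as a single descriptor becomes available.
 */
static bool
tx_can_send(struct tx_ring *r)
{
	assert(r != NULL);
	if (r->stopped) {
		if (r->free < TX_RESUME_THRESH)
			return (false);
		r->stopped = false;
	}
	if (r->free == 0) {
		r->stopped = true;
		return (false);
	}
	return (true);
}
```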
MFC after: 3 days
sdhci_fdt.
Enable the SDHCI controller, bus and devices on ARMADA38X kernel.
Tested on: ClearFog Pro
Reviewed by: Marcin Wojtas <mw at semihalf.com>
Sponsored by: Rubicon Communications, LLC (Netgate)
Differential Revision: https://reviews.freebsd.org/D10606
Two blocks in e1000_ich8lan.c are misaligned, causing noise with some
compilers (gcc 6).
Reviewed by: imp, erj
Differential Revision: https://reviews.freebsd.org/D10741
and Braswell eMMC and SDXC controllers share the same IDs. Like in
the PCI case, Braswell eMMC needs the SDHCI_QUIRK_DATA_TIMEOUT_1MHZ
quirk (see r311794 for the corresponding change to the sdhci(4)
PCI front-end), though. However, due to the shared ACPI IDs, this
is trickier to do.
- Intel Apollo Lake eMMC and SDXC controllers are affected by the
APL18 ("Using 32-bit Addressing Mode With SD/eMMC Controller May
Lead to Unpredictable System Behavior") silicon bug [1]. When this
erratum hits, typically both SDHCI and XHCI controllers wedge.
According to Intel, using ADMA2 with 64-bit addressing and 96-bit
descriptors serves as a workaround. Until sdhci(4) gains ADMA2
support, flag DMA as broken for affected interfaces.
This turns out to work around the problem, too, at the cost of
performance.
- In the sdhci(4) ACPI front-end, probe the Intel Apollo Lake eMMC
and SDXC controllers, too.
1: http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/pentium-celeron-n-series-j-series-datasheet-spec-update.pdf
* This fixes cases where the group id of wide commands got lost, e.g. this
happened to the IWM_SCAN_ABORT_UMAC command.
Obtained from: dragonflybsd.git 71310fab0caca79bb5da43d9d642e77a4c27eea2
* Since a RUN -> INIT/SCAN transition seems to immediately destroy the
ieee80211_node for the AP, we can't read the in_assoc value from there.
Instead just directly pass that information via a boolean_t argument.
* Adds iwm_mvm_rm_sta_id() function, which just unconditionally removes
the station from the firmware.
* The iwm_mvm_rm_sta() function shouldn't actually remove the station from
firmware when we are still associated (i.e. during a RUN -> INIT/SCAN
transition).
* So when disassociating we will first call iwm_mvm_rm_sta() to drain the
queues/fifos. Later during disassociation we will then use
iwm_mvm_rm_sta_id() to actually remove the station.
Inspired-By: Linux iwlwifi
Obtained from: dragonflybsd.git 81b3c1fe9122fa22f33d97103039cc375f656231
* Add a per-vap ps_disabled flag, and use it for a workaround which fixes
an association issue when powersaving is enabled.
* Compute a flag that should correspond to the mvmif->bss_conf.ps flag in
Linux's iwlwifi (e.g. this disallows powersaving when not associated
yet).
Inspired-By: Linux iwlwifi
Obtained from: dragonflybsd.git dc2e69bdfe8c9d7049c8a28da0adffbfbc6de5c0
* Power management handling is per-vap, not per-node, so we should pass
the iwm_vap in these arguments.
Obtained from: dragonflybsd.git 62a4e7957a736b4de38938b02fa7eb9b45bc5d0d
* Otherwise we would never update powersaving settings until we complete
an association, after the first authentication attempt.
* This corresponds to what Linux iwlwifi seems to do.
Obtained from: dragonflybsd.git aa128dc02a17c2e616232ef0fa997121e969c995
* Tear down the relevant firmware state (i.e. the station, the vif binding)
in these transition cases.
* Previously these cases would leave the firmware state lying around, resulting
in errors and firmware panics in the subsequent association attempts.
Obtained from: dragonflybsd.git 94b501399fde6368ae388a669c95b099a6e66e93
* This adds iwm_mvm_rm_sta(), which will be used to tear down firmware
state for better/cleaner iwm_newstate() handling.
* Makes iwm_enable_txq() and iwm_mvm_flush_tx_path() non-static, add
the declarations to if_iwm_util.h for now.
Obtained from: dragonflybsd.git 85d1c6190c4c3564b1a347f253e823aa95c202b2
* Hence no need to keep stuff in separate iwm_assoc() function, just
inline the stuff into iwm_newstate().
Obtained from: dragonflybsd.git e8f7d88e0d030f138f95ecdb7c1a729d9fb0d6ab
* Inspired by iwn(4) and Linux iwlwifi.
* Read wme parameters into a buffer within struct iwm_vap in
iwm_wme_update().
* If we aren't associated yet, the new settings will soon be sent
by iwm_mvm_mac_ctxt_changed() during association.
* If we are already associated, explicitly call iwm_mvm_mac_ctxt_changed()
from iwm_wme_update() to send the new settings to the firmware.
* Change iwm_mvm_ac_to_tx_fifo mapping, to fit the freebsd net80211
WME stream class numbering, instead of Linux's enum ieee80211_ac_numbers.
Obtained from: dragonflybsd.git b8bd6cd746d1f45e616ccfcbeed06dfe452a1108
* Factor out iwm_handle_rxb() function from iwm_notif_intr().
* Removing the IWM_FH_RCSR_CHNL0_RX_CONFIG_SINGLE_FRAME_MSK flag allows
the device to put multiple frames (both command responses and 80211
frames) into a single RX buffer.
* Uses m_copym() to split up the receive buffers when multiple 80211
frames are received in one RX buffer. The effect is basically the same
as when using m_split(), but we want to keep the original mbuf around
when calling iwm_mvm_rx_rx_mpdu() to make error handling a bit easier
for now.
* Contains a small optimization to avoid the m_copym() when only a single
80211 frame is received in one RX buffer (i.e. matching the existing
behaviour).
Obtained from: dragonflybsd.git b5eb43f0280bbcfd26af51cf5a4b8e8ff3590b67
* Fixes oversight from commit 757eecf0e6c92745aa2eee95811e573c8300850e.
fw_has_api now uses the isset macro instead of a simple logical-and.
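
To show why a bitmap test differs from a plain AND, here is a small
userspace sketch. bitmap_isset() mirrors the usual byte-array bit test
that an isset-style macro performs; naive_and() is a hypothetical
example of a flawed check and is not claimed to be the code that was
replaced.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Test bit number `bit` in a byte-array bitmap. */
static bool
bitmap_isset(const unsigned char *map, size_t bit)
{
	assert(map != NULL);
	return ((map[bit / CHAR_BIT] & (1u << (bit % CHAR_BIT))) != 0);
}

/*
 * Hypothetical flawed check: ANDs the bit *index* against the first
 * byte, so it treats the index as a mask and never looks past byte 0.
 */
static bool
naive_and(const unsigned char *map, size_t bit)
{
	assert(map != NULL);
	return ((map[0] & bit) != 0);
}
```

With bit 9 set (byte 1, bit 1), bitmap_isset() finds it while the naive
AND against the first byte does not.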
Obtained from: dragonflybsd.git c00575de8491dc402abf52c8c7e1cca1ef79e257