This patch currently supports:
- ibcore as a kernel module only
- krping as a kernel module only
- ipoib as a kernel module only
Sponsored by: Mellanox Technologies
In particular:
- Don't evaluate event conditions with a sleepqueue lock held, since such
code may attempt to acquire arbitrary locks.
- Fix the return value for wait_event_interruptible() in the case that the
wait is interrupted by a signal.
- Implement wait_on_bit_timeout() and wait_on_atomic_t().
- Implement some functions used to test for pending signals.
- Implement a number of wait_event_*() variants and unify the existing
implementations.
- Unify the mechanism used by wait_event_*() and schedule() to put the
calling thread to sleep.
This is required to support updated DRM drivers. Thanks to hselasky for
finding and fixing a number of bugs in the original revision.
Reviewed by: hselasky
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10986
Its purpose was to translate the values for msdosfs inode numbers,
which is calculated from the msdosfs structures describing the file,
into the range representable by 32bit ino_t. The translation acted
for filesystems larger than 128Gb, it reserved the range 0xf0000000
(FILENO_FIRST_DYN) to UINT32_MAX and remembered some arbitrary
translation of ino >= FILENO_FIRST_DYN into this range. It consumed
memory that could be only freed by unmount, and the translation was
not stable across remounts.
With ino_t type extended to 64 bit, there is no such issue and values
can be returned without compaction to 32bit. That is, for the native
environments, the translation layer is not necessary and adds
significant undeserved code complexity. For compat ABIs which use
32bit ino_t, the vfs.ino64_trunc_error sysctl provides some measures
to soften the failure mode when inode numbers truncation is not safe.
Discussed with: bde
Sponsored by: The FreeBSD Foundation
* This change also fixes a possible issue in the existing smart-fifo code,
which set the IWM_SF_CFG_DUMMY_NOTIF_OFF bit on AC8260 chipsets, although
that's only used in iwlwifi for Family 8000 chipsets connected via SDIO
interface.
Obtained from: Dragonflybsd.git cb650b01526b0aeef3c4307d926e7f1428997d50
This is closely tied to the Extended Attribute implementation.
Submitted by: Fedor Uporov
Reviewed by: kevlo, pfg
Differential Revision: https://reviews.freebsd.org/D10807
were copied to the buffer supplied by the user.
Also fix getrandom() if Linuxulator modules are built without the kernel.
PR: 219464
Submitted by: Maciej Pasternacki
Reported by: Maciej Pasternacki
MFC after: 1 week
In the deep past, when this code compiled as a binary module, ath_hal
built as a module. This allowed custom, smaller HAL modules to be built.
This was especially beneficial for small embedded platforms where you
didn't require /everything/ just to run.
However, sometime around the HAL opening fanfare, the HAL landed here
as one big driver+HAL thing, and a lot of the (dirty) infrastructure
(ie, #ifdef AH_SUPPORT_XXX) to build specific subsets of the HAL went away.
This was retained in sys/conf/files as "ath_hal_XXX" but it wasn't
really floated up to the modules themselves.
I'm now in a position where for the reaaaaaly embedded boards (both the
really old and the last couple generation of QCA MIPS boards) having a
cut down HAL module and driver loaded at runtime is /actually/ beneficial.
This reduces the kernel size down by quite a bit. The MIPS modules look
like this:
adrian@gertrude:~/work/freebsd/head-embedded/src % ls -l ../root/mips_ap/boot/kernel.CARAMBOLA2/ath*ko
-r-xr-xr-x 1 adrian adrian 5076 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_dfs.ko
-r-xr-xr-x 1 adrian adrian 100588 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_hal.ko
-r-xr-xr-x 1 adrian adrian 627324 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_hal_ar9300.ko
-r-xr-xr-x 1 adrian adrian 314588 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_main.ko
-r-xr-xr-x 1 adrian adrian 23472 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_rate.ko
And the x86 versions, like this:
root@gertrude:/home/adrian # ls -l /boot/kernel/ath*ko
-r-xr-xr-x 1 root wheel 36632 May 24 18:32 /boot/kernel/ath_dfs.ko
-r-xr-xr-x 1 root wheel 134440 May 24 18:32 /boot/kernel/ath_hal.ko
-r-xr-xr-x 1 root wheel 82320 May 24 18:32 /boot/kernel/ath_hal_ar5210.ko
-r-xr-xr-x 1 root wheel 104976 May 24 18:32 /boot/kernel/ath_hal_ar5211.ko
-r-xr-xr-x 1 root wheel 236144 May 24 18:32 /boot/kernel/ath_hal_ar5212.ko
-r-xr-xr-x 1 root wheel 336104 May 24 18:32 /boot/kernel/ath_hal_ar5416.ko
-r-xr-xr-x 1 root wheel 598336 May 24 18:32 /boot/kernel/ath_hal_ar9300.ko
-r-xr-xr-x 1 root wheel 406144 May 24 18:32 /boot/kernel/ath_main.ko
-r-xr-xr-x 1 root wheel 55352 May 24 18:32 /boot/kernel/ath_rate.ko
.. so you can see, not building the whole HAL can save quite a bit.
For example, if you don't need AR9300 support, you can actually avoid
wasting half a megabyte of RAM. On embedded routers this is quite a
big deal.
The AR9300 HAL can be later further shrunk because, hilariously,
it indeed supports AH_SUPPORT_<xxx> for optionally adding chipset support.
(I'll chase that down later as it's quite a big savings if you're only
building for a single embedded target.)
So:
* Create a very hackish way to load/unload HAL modules
* Create module metadata for each HAL subtype - ah_osdep_arXXXX.c
* Create module metadata for ath_rate and ath_dfs (bluetooth is
currently just built as part of it)
* .. yes, this means we could actually build multiple rate control
modules and pick one at load time, but I'd rather just glue this
into net80211's rate control code. Oh well, baby steps.
* Main driver is now "ath_main"
* Create an "if_ath" module that does what the ye olde one did -
load PCI glue, main driver, HAL and all child modules.
In this way, if you have "if_ath_load=YES" in /boot/modules.conf
it will load everything the old way and stuff should still work.
* For module autoloading purposes, I actually /did/ fix up
the name of the modules in if_ath_pci and if_ath_ahb.
If you want to selectively load things (eg on ye cheape ARM/MIPS platforms
where RAM is at a premium) you should:
* load ath_hal
* load the chip modules in question
* load ath_rate, ath_dfs
* load ath_main
* load if_ath_pci and/or if_ath_ahb depending upon your particular
bus bind type - this is where probe/attach is done.
TODO:
* AR5312 module and associated pieces - yes, we have the SoC side support
now so the wifi support would be good to "round things out";
* Just nuke AH_SUPPORT_AR5416 for now and always bloat the packet
structures; this'll simplify other things.
* Should add a simple refcnt thing to the HAL RF/chip modules so you
can't unload them whilst you're using them.
* Manpage updates, UPDATING if appropriate, etc.
The latest firmware has a number of link related fixes, support for a
new custom card, and the fix for a bug that affected rate limiting on
FreeBSD.
Obtained from: Chelsio Communications
MFC after: 1 week
Sponsored by: Chelsio Communications
ENA is a networking interface designed to make good use of modern CPU
features and system architectures.
The ENA device exposes a lightweight management interface with a
minimal set of memory mapped registers and extendable command set
through an Admin Queue.
The driver supports a range of ENA devices, is link-speed independent
(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has
a negotiated and extendable feature set.
Some ENA devices support SR-IOV. This driver is used for both the
SR-IOV Physical Function (PF) and Virtual Function (VF) devices.
ENA devices enable high speed and low overhead network traffic
processing by providing multiple Tx/Rx queue pairs (the maximum number
is advertised by the device via the Admin Queue), a dedicated MSI-X
interrupt vector per Tx/Rx queue pair, and CPU cacheline optimized
data placement.
The ENA driver supports industry standard TCP/IP offload features such
as checksum offload and TCP transmit segmentation offload (TSO).
Receive-side scaling (RSS) is supported for multi-core scaling.
The ENA driver and its corresponding devices implement health
monitoring mechanisms such as watchdog, enabling the device and driver
to recover in a manner transparent to the application, as well as
debug logs.
Some of the ENA devices support a working mode called Low-latency
Queue (LLQ), which saves several more microseconds. This feature will
be implemented for driver in future releases.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Jakub Palider <jpa@semihalf.com>
Jan Medala <jan@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon.com Inc.
Differential revision: https://reviews.freebsd.org/D10427
This will help Jenkins dedupe 9 warnings between the static build and
the module build of ipsec(4).
Missed in SRCTOP conversion in r314651.
MFC with: r314651
Sponsored by: Dell EMC Isilon
The ccr(4) driver supports use of the crypto accelerator engine on
Chelsio T6 NICs in "lookaside" mode via the opencrypto framework.
Currently, the driver supports AES-CBC, AES-CTR, AES-GCM, and AES-XTS
cipher algorithms as well as the SHA1-HMAC, SHA2-256-HMAC, SHA2-384-HMAC,
and SHA2-512-HMAC authentication algorithms. The driver also supports
chaining one of AES-CBC, AES-CTR, or AES-XTS with an authentication
algorithm for encrypt-then-authenticate operations.
Note that this driver is still under active development and testing and
may not yet be ready for production use. It does pass the tests in
tests/sys/opencrypto with the exception that the AES-GCM implementation
in the driver does not yet support requests with a zero byte payload.
To use this driver currently, the "uwire" configuration must be used
along with explicitly enabling support for lookaside crypto capabilities
in the cxgbe(4) driver. These can be done by setting the following
tunables before loading the cxgbe(4) driver:
hw.cxgbe.config_file=uwire
hw.cxgbe.cryptocaps_allowed=-1
MFC after: 1 month
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D10763
2. Use sysctls for TRACE_LRO_CNT and TRACE_TSO_PKT_LEN
3. remove unused mtx tx_lock
4. bind taskqueue kernel thread to the appropriate cpu core
5. when tx_ring is full, stop further transmits till at least 1/16th of the Tx Ring is empty. In our case 1K entries. Also if there are rx_pkts to process, put the taskqueue thread to sleep for 100ms, before enabling interrupts.
6. Use rx_pkt_threshold of 128.
MFC after:3 days
* This adds iwm_mvm_rm_sta(), which will be used to tear down firmware
state for better/cleaner iwm_newstate() handling.
* Makes iwm_enable_txq() and iwm_mvm_flush_tx_path() non-static, add
the declarations to if_iwm_util.h for now.
Obtained from: dragonflybsd.git 85d1c6190c4c3564b1a347f253e823aa95c202b2
For GEN1 Hyper-V, vmbus is attached to pcib0, which contains the
resources for PCI passthrough and SR-IOV. There is no
acpi_syscontainer0 on GEN1 Hyper-V.
For GEN2 Hyper-V, vmbus is attached to acpi_syscontainer0, which
contains the resources for PCI passthrough and SR-IOV. There is
no pcib0 on GEN2 Hyper-V.
The ACPI VMBUS device now only holds its _CRS, which is empty as
of this commit; its existence is mainly for upward compatibility.
Device tree structure is suggested by jhb@.
Tested-by: dexuan@
Collabrated-wth: dexuan@
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D10565
the TOE. For now this capability is always enabled in kernels with
options RATELIMIT. t4_tom will check if_capenable once the base driver
gets code to support rate limiting for any socket (TOE or not).
This was tested with iperf3 and netperf ToT as they already support
SO_MAX_PACING_RATE sockopt. There is a bug in firmwares prior to
1.16.45.0 that affects the BSD driver only and results in rate-limiting
at an incorrect rate. This will resolve by itself as soon as 1.16.45.0
or later firmware shows up in the driver.
Relnotes: Yes
Sponsored by: Chelsio Communications
- Create a new file, t4_sched.c, and move all of the code related to
traffic management from t4_main.c and t4_sge.c to this file.
- Track both Channel Rate Limiter (ch_rl) and Class Rate Limiter (cl_rl)
parameters in the PF driver.
- Initialize all the cl_rl limiters with somewhat arbitrary default
rates and provide routines to update them on the fly.
- Provide routines to reserve and release traffic classes.
MFC after: 1 month
Sponsored by: Chelsio Communications
patm(4) devices.
Maintaining an address family and framework has real costs when we make
infrastructure improvements. In the case of NATM we support no devices
manufactured in the last 20 years and some will not even work in modern
motherboards (some newer devices that patm(4) could be updated to
support apparently exist, but we do not currently have support).
With this change, support remains for some netgraph modules that don't
require NATM support code. It is unclear if all these should remain,
though ng_atmllc certainly stands alone.
Note well: FreeBSD 11 supports NATM and will continue to do so until at
least September 30, 2021. Improvements to the code in FreeBSD 11 are
certainly welcome.
Reviewed by: philip
Approved by: harti
FDC_DEBUG is not referenced in any c or header files but traces of it
still remain in other files.
PR: 105608
Reported by: Eugene Grosbein <ports AT grosbein DOT net>
Reviewed by: imp
Approved by: bcr (mentor)
MFC after: 7 days
Differential Revision: https://reviews.freebsd.org/D10303
- Move all bitmap related functions from bitops.h to bitmap.h, similar
to what Linux does.
- Apply some minor code cleanup and simplifications to optimize the
generated code when using static inline functions.
- Implement the following list of bitmap functions which are needed by
drm-next and ibcore:
- bitmap_find_next_zero_area_off()
- bitmap_find_next_zero_area()
- bitmap_or()
- bitmap_and()
- bitmap_xor()
- Add missing include directives to the qlnxe driver
(davidcs@ has been notified)
MFC after: 1 week
Sponsored by: Mellanox Technologies
DTrace includes assym.s, to build this we build assym.o, however
this is unneeded as assym.s only contains macros.
Remove the need to build this by removing it from OBJS, but keep assym.s
in the module dependencies via DPSRCS.
This fixes the build when there is no assembler, e.g. on arm64 without
the external binutils.
Submitted by: andrew
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10041
The module is designed for modification of a packets of any protocols.
For now it implements only TCP MSS modification. It adds the external
action handler for "tcp-setmss" action.
A rule with tcp-setmss action does additional check for protocol and
TCP flags. If SYN flag is present, it parses TCP options and modifies
MSS option if its value is greater than configured value in the rule.
Then it adjustes TCP checksum if needed. After handling the search
continues with the next rule.
Obtained from: Yandex LLC
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Yandex LLC
No objection from: #network
Differential Revision: https://reviews.freebsd.org/D10150
The goal of this work is to remove the explicit dependency for ctl(4)
on iscsi(4), so end-users without iscsi(4) support in the kernel can
use ctl(4) for its other functions.
This allows those without iscsi(4) support built into the kernel to use
ctl(4) as a test mechanism. As a sidenote, this was possible around the
10.0-RELEASE period, but made impossible for end-users without iscsi(4)
between 10.0-RELEASE and 11.0-RELEASE.
Automatically load cfiscsi(4) from ctladm(8) and ctld(8) for backwards
compatibility with previously releases. The automatic loading feature is
compiled into the beforementioned tools if MK_ISCSI == yes when building
world.
Add a manpage for cfiscsi(4) and refer to it in ctl(4).
Differential Revision: D10099
MFC after: 2 months
Relnotes: yes
Reviewed by: mav, trasz
Sponsored by: Dell EMC Isilon
instrument security event auditing rather than relying on conventional BSM
trail files or audit pipes:
- Add a set of per-event 'commit' probes, which provide access to
particular auditable events at the time of commit in system-call return.
These probes gain access to audit data via the in-kernel audit_record
data structure, providing convenient access to system-call arguments and
return values in a single probe.
- Add a set of per-event 'bsm' probes, which provide access to particular
auditable events at the time of BSM record generation in the audit
worker thread. These probes have access to the in-kernel audit_record
data structure and BSM representation as would be written to a trail
file or audit pipe -- i.e., asynchronously in the audit worker thread.
DTrace probe arguments consist of the name of the audit event (to support
future mechanisms of instrumenting multiple events via a single probe --
e.g., using classes), a pointer to the in-kernel audit record, and an
optional pointer to the BSM data and its length. For human convenience,
upper-case audit event names (AUE_...) are converted to lower case in
DTrace.
DTrace scripts can now cause additional audit-based data to be collected
on system calls, and inspect internal and BSM representations of the data.
They do not affect data captured in the audit trail or audit pipes
configured in the system. auditd(8) must be configured and running in
order to provide a database of event information, as well as other audit
configuration parameters (e.g., to capture command-line arguments or
environmental variables) for the provider to operate.
Reviewed by: gnn, jonathan, markj
Sponsored by: DARPA, AFRL
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D10149
the default partition, eMMC v4.41 and later devices can additionally
provide up to:
1 enhanced user data area partition
2 boot partitions
1 RPMB (Replay Protected Memory Block) partition
4 general purpose partitions (optionally with a enhanced or extended
attribute)
Of these "partitions", only the enhanced user data area one actually
slices the user data area partition and, thus, gets handled with the
help of geom_flashmap(4). The other types of partitions have address
space independent from the default partition and need to be switched
to via CMD6 (SWITCH), i. e. constitute a set of additional "disks".
The second kind of these "partitions" doesn't fit that well into the
design of mmc(4) and mmcsd(4). I've decided to let mmcsd(4) hook all
of these "partitions" up as disk(9)'s (except for the RPMB partition
as it didn't seem to make much sense to be able to put a file-system
there and may require authentication; therefore, RPMB partitions are
solely accessible via the newly added IOCTL interface currently; see
also below). This approach for one resulted in cleaner code. Second,
it retains the notion of mmcsd(4) children corresponding to a single
physical device each. With the addition of some layering violations,
it also would have been possible for mmc(4) to add separate mmcsd(4)
instances with one disk each for all of these "partitions", however.
Still, both mmc(4) and mmcsd(4) share some common code now e. g. for
issuing CMD6, which has been factored out into mmc_subr.c.
Besides simply subdividing eMMC devices, some Intel NUCs having UEFI
code in the boot partitions etc., another use case for the partition
support is the activation of pseudo-SLC mode, which manufacturers of
eMMC chips typically associate with the enhanced user data area and/
or the enhanced attribute of general purpose partitions.
CAVEAT EMPTOR: Partitioning eMMC devices is a one-time operation.
- Now that properly issuing CMD6 is crucial (so data isn't written to
the wrong partition for example), make a step into the direction of
correctly handling the timeout for these commands in the MMC layer.
Also, do a SEND_STATUS when CMD6 is invoked with an R1B response as
recommended by relevant specifications. However, quite some work is
left to be done in this regard; all other R1B-type commands done by
the MMC layer also should be followed by a SEND_STATUS (CMD13), the
erase timeout calculations/handling as documented in specifications
are entirely ignored so far, the MMC layer doesn't provide timeouts
applicable up to the bridge drivers and at least sdhci(4) currently
is hardcoding 1 s as timeout for all command types unconditionally.
Let alone already available return codes often not being checked in
the MMC layer ...
- Add an IOCTL interface to mmcsd(4); this is sufficiently compatible
with Linux so that the GNU mmc-utils can be ported to and used with
FreeBSD (note that due to the remaining deficiencies outlined above
SANITIZE operations issued by/with `mmc` currently most likely will
fail). These latter will be added to ports as sysutils/mmc-utils in
a bit. Among others, the `mmc` tool of the GNU mmc-utils allows for
partitioning eMMC devices (tested working).
- For devices following the eMMC specification v4.41 or later, year 0
is 2013 rather than 1997; so correct this for assembling the device
ID string properly.
- Let mmcsd.ko depend on mmc.ko. Additionally, bump MMC_VERSION as at
least for some of the above a matching pair is required.
- In the ACPI front-end of sdhci(4) describe the Intel eMMC and SDXC
controllers as such in order to match the PCI one.
Additionally, in the entry for the 80860F14 SDXC controller remove
the eMMC-only SDHCI_QUIRK_INTEL_POWER_UP_RESET.
OKed by: imp
Submitted by: ian (mmc_switch_status() implementation)
They cannot be used anymore with the userland bits we provide.
Furthermore, their KMS versions support the same hardware.
Submitted by: dumbbell
Reviewed by: emaste, manu
Sponsored by: AsiaBSDCon
Differential Revision: https://reviews.freebsd.org/D5614
When locking a mutex and deadlock is detected the first mutex lock
call that sees the deadlock will return -EDEADLK .
MFC after: 1 week
Sponsored by: Mellanox Technologies
spigen provides userland API to SPI bus. Make it available as a loadable
module so people using official ARM images can enabled it on devices like
BBB or RPi without re-building kernel
MFC after: 1 week
Put large functions into linux_slab.c instead of declaring them static
inline.
Add support for more memory allocation wrappers like kmalloc_array()
and __vmalloc().
Make sure either the M_WAITOK or the M_NOWAIT flag is set and mask
away unused memory allocation flags before calling FreeBSD's malloc()
routine.
Move kmalloc_node() definition to slab.h where it belongs.
Implement support for the SLAB_DESTROY_BY_RCU feature when creating a
kmem_cache which basically means kmem_cache memory is freed using
call_rcu().
MFC after: 1 week
Sponsored by: Mellanox Technologies
The module uses unnamed structure and union fields and base GCC in
stable/10 doesn't like it.
I think that that is a C11 feature, so it is courteous of more modern
compilers to not complain about it when compiling in C99 mode.
Approved by: davidcs
MFC after: 5 days
This change makes the workqueue implementation behave more like in
Linux, both functionality wise and structure wise.
All workqueue code has been moved to linux_work.c
Add an atomic based statemachine to the work_struct to ensure proper
operation. Prior to this change struct_work was directly mapped to a
FreeBSD task. When a taskqueue has multiple threads the same task may
end up being executed on more than one worker thread simultaneously.
This might cause problems with code coming from Linux, which expects
serial behaviour, similar to Linux tasklets.
Move all global workqueue function names into the linux_xxx domain to
avoid symbol name clashes in the future.
Implement a few more workqueue related functions and macros.
Create two multithreaded taskqueues for the LinuxKPI during module
load, one for time-consuming callbacks and one for non-time consuming
callbacks.
MFC after: 1 week
Sponsored by: Mellanox Technologies