CAM_DEBUG_TRACE results in far more debug output than is needed now.
When debugging, it's always possible to turn on trace level using camcontrol.
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D12110
by Matt Macy as well as other changes which he has accepted via pull
request to his github repo at https://github.com/mattmacy/networking/
This should bring -CURRENT and the github repo into close enough sync to
allow small feature branches rather than a large chain of interdependent
patches being developed out of tree. The rest of the synchronization
should be able to be completed on github by splitting the remaining
changes that are not yet ready into short feature branches for later
review as smaller commits.
Here is a summary of changes included in this patch:
1) More checks when INVARIANTS are enabled for earlier problem
detection
2) Group Task Queue cleanups
- Fix use of duplicate shortdesc for gtaskqueue malloc type.
Some interfaces such as memguard(9) use the short description to
identify malloc types, so duplicates should be avoided.
3) Allow gtaskqueues to use ithreads in addition to taskqueues
- In some cases, this can improve performance
4) Better logging when taskqgroup_attach*() fails to set interrupt
affinity.
5) Do not start gtaskqueues until they're needed
6) Have mp_ring enqueue function enter the ABDICATED rather than BUSY
state. This moves the TX to the gtaskq and allows processing to
continue faster as well as make TX batching more likely.
7) Add an ift_txd_errata function to struct if_txrx. This allows
drivers to inspect/modify mbufs before transmission.
8) Add a new IFLIB_NEED_ZERO_CSUM for drivers to indicate they need
checksums zeroed for checksum offload to work. This avoids modifying
packet data in the TX path when possible.
9) Use ithreads for iflib I/O instead of taskqueues
10) Clean up ioctl and support async ioctl functions
11) Prefetch two cache lines (up to 128B) from each mbuf instead of one. We
often need to parse packet header info beyond 64B.
12) Fix potential memory corruption due to fence post error in
bit_nclear() usage.
13) Improved hang detection and handling
14) If the packet is smaller than MTU, disable the TSO flags.
This avoids extra packet parsing when not needed.
15) Move TCP header parsing inside the IS_TSO?() test.
This avoids extra packet parsing when not needed.
16) Pass chains of mbufs that are not consumed by lro to if_input()
rather than calling if_input() for each mbuf.
17) Re-arrange packet header loads to get as much work as possible done
before a cache stall.
18) Lock the context when calling IFDI_ATTACH_PRE()/IFDI_ATTACH_POST()/
IFDI_DETACH()
19) Attempt to distribute RX/TX tasks across cores more sensibly,
especially when RX and TX share an interrupt. RX will attempt to
take the first threads on a core, and TX will attempt to take
successive threads.
20) Allow iflib_softirq_alloc_generic() to request affinity to the same
cpus an interrupt has affinity with. This allows TX queues to
ensure they are serviced by the socket the device is on.
21) Add new iflib sysctls to net.iflib:
- timer_int - interval at which to run per-queue timers in ticks
- force_busdma
22) Add new per-device iflib sysctls to dev.X.Y.iflib
- rx_budget allows tuning the batch size on the RX path
- watchdog_events Count of watchdog events seen since load
23) Fix error where netmap_rxq_init() could get called before
IFDI_INIT()
24) e1000: Fixed version of r323008: post-cold sleep instead of DELAY
when waiting for firmware
- After interrupts are enabled, convert all waits to sleeps
- Eliminates e1000 software/firmware synchronization busy waits after
startup
25) e1000: Remove special case for budget=1 in em_txrx.c
- Premature optimization which may actually be incorrect with
multi-segment packets
26) e1000: Split out TX interrupt rather than share an interrupt for
RX and TX.
- Allows better performance by keeping RX and TX paths separate
27) e1000: Separate igb from em code where suitable
Much easier to understand separate functions and "if (is_igb)" than
previous tests like "if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC))"
#blamebruno
Reviewed by: sbruno
Approved by: sbruno (mentor)
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D12235
Normally after receiving a packet, a vlan(4) interface sends the packet
back through its parent interface's rx routine so that it can be
processed as an untagged frame. It does this by using the parent's
ifp->if_input. This is incompatible with netmap(4), which replaces the
vlan(4) interface's if_input with a netmap(4) hook. Fix this by using
the vlan(4) interface's ifp instead of the parent's directly.
Reported by: Harry Schmalzbauer <freebsd@omnilan.de>
Reviewed by: rstone
Approved by: rstone (mentor)
MFC after: 3 days
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12191
The cam_iosched_ticker() can't be scheduled more than once per tick.
Some limiters depend on quanta matching the number of calls per second
to enforce the proper limits. Limit the quanta to no faster than 1 per
clock tick. This fixes some features when running in VMs where the
default HZ is 100.
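The clamp described above can be sketched as follows. This is only an illustration in Python, not the kernel code: clamp_quanta and its parameters are hypothetical names, hz stands for the kernel tick rate, and the requested values are made up.

```python
def clamp_quanta(requested_quanta: int, hz: int) -> int:
    """Return how many ticker runs per second can actually be scheduled.

    A callout cannot fire more than once per clock tick, so the effective
    quanta is capped at hz.
    """
    if requested_quanta <= 0 or hz <= 0:
        raise ValueError("quanta and hz must be positive")
    return min(requested_quanta, hz)


# With HZ=100 (the VM case), an illustrative request for 200 runs per
# second is capped at 100; with HZ=1000 a request for 50 is left alone.
print(clamp_quanta(200, 100))
print(clamp_quanta(50, 1000))
```

Limiters that divide a per-second budget by the quanta then see the number of calls that really happen each second.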
PR: 221953
Obtained from: ElectroBSD
Differential Revision: https://reviews.freebsd.org/D12337
Submitted by: Fabian Keil
It's awkward to have spaces in CAM device serial numbers. That leads to
such things as device nodes named "/dev/diskid/MYSERIAL%20%20%201". Better
to replace the spaces with "0"s. This change only affects the default
serial numbers for users who don't provide their own.
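A minimal sketch of the substitution, assuming the default serial number is a plain string padded with spaces. The function name and the sample value are hypothetical; the sample is inferred from the device node shown above.

```python
def default_serial(raw: str) -> str:
    """Replace every space in a default serial number with the digit 0."""
    return raw.replace(" ", "0")


# "MYSERIAL   1" is what would show up as MYSERIAL%20%20%201 in /dev/diskid.
print(default_serial("MYSERIAL   1"))
```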
Reviewed by: ken, mav
MFC after: Never
Relnotes: Yes
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D12263
Newer binutils supports extensions to the MIPS ABI for non-PIC code
that is used when compiling O32 binaries with clang 5 (but not used
for N64 oddly enough). These extensions require support for
R_MIPS_COPY relocations as well as a second PLT GOT using
R_MIPS_JUMP_SLOT relocations.
For R_MIPS_COPY, use the same approach as on other architectures where
fixups are deferred to the MD do_copy_relocations.
The additional PLT GOT for jump slots is located in a .got.plt section
which is identified by a DT_MIPS_PLTGOT dynamic entry. This GOT also
requires fixups for the first two GOT entries just as the normal GOT.
However, the entry point for this second GOT uses a different calling
convention. Rather than passing an offset into the GOT, it passes an
offset into the .rel.plt section. This requires a second entry point
(_rtld_pltbind_start) which calls the normal _rtld_bind() rather than
_mips_rtld_bind(). This also means providing a real version of
reloc_jmpslot() which is used by _rtld_bind().
In addition, add real implementations of reloc_plt() and
reloc_jmpslots() which walk .rel.plt handling R_MIPS_JUMP_SLOT
relocations.
Reviewed by: kib
Sponsored by: DARPA / AFRL
Differential Revision: https://reviews.freebsd.org/D12326
The zfsonlinux feature large_dnode is not yet supported by the loader.
Reviewed by: avg, allanjude
Differential Revision: https://reviews.freebsd.org/D12288
"probe" method of those drivers to mean we're on a TI SoC. Introduce a new
function, ti_soc_is_supported(), and use it to be sure we're really a TI
system.
PR: 222250
The uncovered vnode is possible because there is no guarantee that
its hold count would go to zero (and it would be inactivated and reclaimed)
immediately after a covering filesystem is unmounted.
So, such a vnode should be expected and it is possible to re-use it
without any trouble.
MFC after: 3 weeks
Sponsored by: Panzura
The only consumer of zfs_get_vfs, zfs_unmount_snap, does not need
the filesystem to be busy, it just needs a reference that it can pass
to dounmount.
Also, previously the code was racy as it unbusied the filesystem
before taking a reference on it.
Now the code should be simpler and safer.
MFC after: 2 weeks
Sponsored by: Panzura
stop, read, and write methods. Some controllers don't implement these
individual operations and have only a transfer method. In that case, we
should return an indication that the device is present but doesn't support
the method, as opposed to the kobj default error ENXIO which makes it
look like the whole device is missing. Userland tools such as i2c(8) can
use the differing return values to switch between the two different i2c
IO mechanisms.
On AMD, the MCG_CAP feature bit is reserved -- not explicitly zero. Do not
use it to determine CMCI support.
Reviewed by: avg, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12320
This Makefile relies on Makefile.fat providing the correct value for
BOOT1_MAXSIZE and BOOT1_OFFSET. Since BOOT1_OFFSET had no default value
here the build would already fail if Makefile.fat did not provide
correct values.
Sponsored by: The FreeBSD Foundation
illumos/illumos-gate@37e84ab74e
https://www.illumos.org/issues/8569
C [C99] has peculiar rules for inline functions that are different from the
C++ rules. Unlike C++ where inline is "fire and forget", in C a programmer
must pay attention to the function's storage class / visibility. The main
problem is with the case where a compiler decides to not inline a call to the
function declared as inline.
Some relevant links:
- http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15831.html
- http://www.drdobbs.com/the-new-c-inline-functions/184401540
The summary is that either the inline functions should be declared 'static
inline' or one of the compilation units (.c files) must provide a callable
externally visible function definition. In the former case, the compiler would
automatically create a local non-inlined function instance in every compilation
unit where it's needed. In the latter case the single external definition is
used to satisfy any non-inlined calls in all compilation units. As things
stand right now, we can get an undefined reference error under certain
combinations of compilers and compiler options. For example, this is what I
get on FreeBSD when compiling with clang 4.0.0 and -O1:
In function `abd_free': /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:385:
undefined reference to `abd_is_linear'
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>
MFC after: 1 week
illumos/illumos-gate@216d7723a1
https://www.illumos.org/issues/8558
On a system with more than 80K ZFS filesystems, we've seen cases where
lwp_create() will start to fail by returning EAGAIN. The problem is that,
for each of those 80K ZFS filesystems, a taskq will be created as part of
the ZIL for each dataset.
For each of these taskq's, a kernel thread will be created which results
in 24KB being allocated for each thread. With enough of these 24KB
allocations, we eventually exhaust the memory region set aside for these
allocations. Currently, segkpsize is set to a value of 2GB, which means
we can only support about 80K filesystems; 2GB / 24KB = ~80K.
The lwp_create() failure comes into play due to the fact that LWP
creation also allocates 24KB from this same region of memory. Thus, if
we've exhausted this region of memory due to the number of ZIL taskq's,
there won't be any memory available to allow the call to lwp_create() to
succeed.
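The capacity estimate above can be checked with a few lines of arithmetic. This sketch assumes binary units (2GB = 2 * 2^30 bytes, 24KB = 24 * 2^10 bytes); the variable names are only illustrative.

```python
KIB = 1024
GIB = 1024 ** 3

segkp_bytes = 2 * GIB        # size of the region quoted above
per_thread_bytes = 24 * KIB  # allocation per kernel thread or LWP

# Every ZIL taskq thread and every new LWP takes one such allocation,
# so this is the total number of them the region can hold.
max_allocations = segkp_bytes // per_thread_bytes
print(max_allocations)  # 87381
```

The exact quotient is a little above the rounded ~80K figure, and that budget is shared between taskq threads and LWPs.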
FreeBSD note: I haven't created sysctl-s for the new ZIL clean
parameters. Let's add them if anyone needs to tune them.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>
MFC after: 3 weeks
This patch adds an hwtype parameter which keeps information about the hardware
revision of the Marvell EHCI controller. It allows replacing multiple
calls to ofw_bus_is_compatible with a comparison of the hwtype value during
driver initialization.
Submitted by: Patryk Duda <pdk@semihalf.com>
Suggested by: ian
Obtained from: Semihalf
Sponsored by: Semihalf
In advance of other changes to the fat template generation process, have
generate-fat.sh create all template files at the same time so that they
cannot get out of sync.
Also correct a longstanding bug where BOOT1_OFFSET was overwritten on
each invocation. A previous version of this patch stored a per-arch
offset (e.g. BOOT1_arm64_OFFSET) but that was deemed unnecessary.
Instead just hardcode the known offset that applies to all archs (0x2d)
and fail if the offset happens to be different.
Ongoing work (using newfs_msdos in bsdinstall and adding msdosfs support
to makefs) will eventually allow us to do away with this fat template
hack altogether, but in the near term we have a few improvements that
will build on this.
Reviewed by: allanjude, imp, Eric McCorkle
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D10931
In the case of running newvers.sh on a git tree w/o git-svn-id notes we
previously piped the entire 'git log' to grep. Add --grep to the log
invocation to avoid processing log entries of no interest.
This saves about 2-3 seconds of newvers.sh run time on my SSD laptop.
Later changes will bring further speedups.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
This prevents incorrect subversion revision detection when "git svn" is
not being used to get the sources but git is available. Previously old
subversion revisions included in commit messages were favoured over the
more recent and correct revisions in git notes.
For example cf1f355747 represents r315395 but was treated as r313908
which is referenced in the commit message. Commits following
r315395/cf1f35574722 but before another commit with a git-svn-id
reference in the commit message would be treated as r313908 as well.
Patch from PR updated to accommodate the initial four space indent in
`git log` output.
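The idea of trusting only the git-svn-id tag, and not a revision that merely appears in the commit message, can be sketched as below. This is an illustration in Python, not what newvers.sh does: the regular expression, the function name, the repository URL and the UUID in the sample are all assumptions.

```python
import re

# Accept only lines that start (after the log indent) with the tag itself.
GIT_SVN_ID = re.compile(r"^\s*git-svn-id: \S+@(\d+) ", re.MULTILINE)


def svn_revision(log_text):
    """Return the revision from a git-svn-id line, or None if there is none."""
    match = GIT_SVN_ID.search(log_text)
    return int(match.group(1)) if match else None


# Hypothetical log excerpt: the message mentions an older revision, while
# the note carries the revision the commit really corresponds to.
SAMPLE_LOG = """\
commit cf1f355747

    Follow-up fix for r313908.

Notes:
    git-svn-id: svn+ssh://example.org/base/head@315395 00000000-0000-0000-0000-000000000000
"""

print(svn_revision(SAMPLE_LOG))
```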
PR: 221848
Submitted by: Fabian Keil
Obtained from: ElectroBSD
MFC after: 2 weeks
Prior to the change they were subject to extreme false sharing.
In particular this change shaves about 3 seconds real time off -j 80 buildkernel.
Reviewed by: alc, markj
Differential Revision: https://reviews.freebsd.org/D12281
Sometimes it is necessary to combine several gpio pins into an ad-hoc bus
and manipulate the pins as a group. In such cases manipulating the pins
individually is not an option, because the value on the "bus" assumes
potentially-invalid intermediate values as each pin is changed in turn. Note
that the "bus" may be something as simple as a bi-color LED where changing
colors requires changing both gpio pins at once, or something as complex as
a bitbanged multiplexed address/data bus connected to a microcontroller.
In addition to the absolute requirement of simultaneously changing the
output values of driven pins, a desirable feature of these new methods is to
provide a higher-performance mechanism for reading and writing multiple
pins, especially from userland where pin-at-a-time access incurs a noticeable
syscall time penalty.
These new interfaces are NOT intended to abstract away all the ugly details
of how gpio is implemented on any given platform. In fact, to use these
properly you absolutely must know something about how the gpio hardware is
organized. Typically there are "banks" of gpio pins controlled by registers
which group several pins together. A bank may be as small as 2 pins or as
big as "all the pins on the device, hundreds of them." In the latter case, a
driver might support this interface by allowing access to any 32 adjacent
pins within the overall collection. Or, more likely, any 32 adjacent pins
starting at any multiple of 32. Whatever the hardware restrictions may be,
you would need to understand them to use this interface.
In addition to defining the interfaces, two example implementations are
included here, for imx5/6, and allwinner. These represent the two primary
types of gpio hardware drivers. imx6 has multiple gpio devices, each
implementing a single bank of 32 pins. Allwinner implements a single large
gpio number space from 1-n pins, and the driver internally translates that
linear number space to a bank+pin scheme based on how the pins are grouped
into control registers. The allwinner implementation imposes the restriction
that the first_pin argument to the new functions must always be pin 0 of a
bank.
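The linear-to-bank translation and the first_pin restriction can be illustrated with the sketch below. It is not driver code: the 32-pin bank size, the function names and the mask-based update are assumptions chosen for the example.

```python
PINS_PER_BANK = 32  # assumption: each control register groups 32 pins


def split_pin(pin: int):
    """Translate a linear pin number into (bank, pin-within-bank)."""
    return divmod(pin, PINS_PER_BANK)


def check_first_pin(first_pin: int) -> int:
    """Enforce the restriction that first_pin is pin 0 of a bank."""
    bank, offset = split_pin(first_pin)
    if offset != 0:
        raise ValueError("first_pin must be pin 0 of a bank")
    return bank


def update_bank(current: int, mask: int, value: int) -> int:
    """Compute the new register value so all masked pins change in one write."""
    return (current & ~mask) | (value & mask)


# A bi-color LED on pins 0 and 1 of a bank: going from 0b01 to 0b10 in a
# single write never exposes the intermediate states 0b00 or 0b11.
print(split_pin(37))
print(bin(update_bank(0b01, 0b11, 0b10)))
```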
Differential Revision: https://reviews.freebsd.org/D11810
for analyzing the radix tree structures and reporting on the number, and
sizes, of maximal intervals of free blocks. The report includes the number
of maximal intervals, and also the number of them in each of several size
ranges, from small (size 1, or 3 to 4) to large (28657 to 46367) with size
boundaries defined by Fibonacci numbers. The report is written in the test
tool with the 's' command, or in a running kernel by sysctl.
The analysis of the radix tree frequently computes the position of the lone
bit set in a u_daddr_t, a computation that also appears in leaf allocation.
That computation has been moved into a function of its own, and optimized
for cases where an inlined machine instruction can replace the usual binary
search.
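The lone-bit computation can be sketched in Python as follows. The function names are hypothetical; the binary search stands in for the generic path, and int.bit_length() stands in for the single machine instruction mentioned above.

```python
def lone_bit_pos_search(mask: int, width: int = 64) -> int:
    """Find the position of the only set bit by binary search."""
    if mask <= 0 or mask & (mask - 1) or mask >> width:
        raise ValueError("mask must have exactly one bit set within width")
    lo, hi = 0, width  # invariant: the bit position is in [lo, hi)
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if mask >> mid:  # something is set at position mid or above
            lo = mid
        else:
            hi = mid
    return lo


def lone_bit_pos_fast(mask: int) -> int:
    """Same result using one built-in operation."""
    return mask.bit_length() - 1


# Both variants agree for every single-bit 64-bit value.
assert all(lone_bit_pos_search(1 << i) == lone_bit_pos_fast(1 << i) == i
           for i in range(64))
print(lone_bit_pos_search(1 << 37))
```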
Submitted by: Doug Moore <dougm@rice.edu>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11906
it to a random value between 100 and 1123, rather than 0 as before.
Submitted by: Marie Helene Kvello-Aune <marieheleneka@gmail.com>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D5336
Since the efipart rewrite, the chain command was looking for device
handle using interface applicable only for net devices. Disk
partitions and zfs pools need their own approach to find the proper handle.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D12287
namecache_ts differs from plain namecache by a few fields placed mid-struct.
The access to the last element (the name) is thus special-cased.
The standard solution is to put new fields at the very beginning and
embed the original struct. The pointer shuffled around points to the
embedded part. If needed, access to new fields can be gained through
__containerof.
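The embedding pattern can be demonstrated with Python's ctypes, which lays structures out like C. The two structures below are heavily reduced, hypothetical stand-ins for the real ones, and containerof_ts is an illustrative helper mirroring what __containerof does.

```python
import ctypes


class Namecache(ctypes.Structure):
    # Reduced stand-in for the original struct; the name stays last.
    _fields_ = [("nc_flag", ctypes.c_ubyte),
                ("nc_nlen", ctypes.c_ubyte),
                ("nc_name", ctypes.c_char * 16)]


class NamecacheTs(ctypes.Structure):
    # New fields go first, the original struct is embedded after them.
    _fields_ = [("nc_ticks", ctypes.c_int),
                ("nc_nc", Namecache)]


def containerof_ts(nc: Namecache) -> NamecacheTs:
    """Recover the enclosing struct from a reference to the embedded part."""
    offset = NamecacheTs.nc_nc.offset
    return NamecacheTs.from_address(ctypes.addressof(nc) - offset)


ts = NamecacheTs()
ts.nc_ticks = 42
ts.nc_nc.nc_name = b"passwd"

inner = ts.nc_nc                    # what gets passed around
recovered = containerof_ts(inner)   # access to the new fields when needed
print(inner.nc_name, recovered.nc_ticks)
```

Code that only needs the name works on the embedded part without knowing which variant it was given.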
MFC after: 1 week
This module is specific to a single Marvell board that we currently
only support in 64-bit mode. Remove it from the build otherwise. It
likely should be completely removed, but this unbreaks x86 building.
Noticed by: sbruno@
the driver in a place where it will be built for all targets. x86 doesn't
have all the required build bits for this device.
Move the uart(4) device mvebu to arm64 only.
lock if both old and new pages use the same underlying lock. Convert
existing places to use the helper instead of inlining it. Use the
optimization in vm_object_page_remove().
Suggested and reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
have a scope ID. Change size of the searched scope ID to the full
16 bits. There can typically be more than 255 interfaces.
Suggested by: ae @
MFC after: 1 week
Sponsored by: Mellanox Technologies
This patch enables using NETA driver on Marvell Armada 3700 SoC
by introducing new compatible string, modifying clock source
obtaining and also excluding unnecessary parts.
The driver is added as a build option for arm64 platforms as well.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12258
Now the virtual address of received buffer is taken from a software ring.
Thanks to this, we can use the NETA driver on 64-bit architectures and
avoid 32-bit buf_cookie descriptor field limitation.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12257
This patch adds support for UART in Armada 3700 family.
It exposes both low-level UART interface, as well as
standard driver methods.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12250
Enabled driver can be used on boards equipped with Marvell Armada 3700 SoC.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12256
This patch reuses the ehci_mv driver by adding support for the new
compatible string and adding ehci_mv.c to list of available options
for arm64 platforms.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12255
illumos/illumos-gate@1702cce751
FreeBSD note: rather than merging the zpool.8 update I copied the zpool
scrub section from the illumos zpool.1m to FreeBSD zpool.8 almost
verbatim. Now that the illumos page uses the mdoc format, it was an
easier option. Perhaps the change is not in perfect compliance with the
FreeBSD style, but I think that it is acceptable.
https://www.illumos.org/issues/8414
This issue tracks the port of scrub pause from ZoL: https://github.com/zfsonlinux/zfs/pull/6167
Currently, there is no way to pause a scrub. Pausing may be useful when
the pool is busy with other I/O to preserve bandwidth.
Description
This patch adds the ability to pause and resume scrubbing. This is achieved
by maintaining a persistent on-disk scrub state. While the state is 'paused'
we do not scrub any more blocks. We do however perform regular scan
housekeeping such as freeing async destroyed and deadlist blocks while paused.
Motivation and Context
Scrubbing can be an I/O intensive operation and people have been asking
for the ability to pause a scrub for a while. This allows one to preserve scrub
progress while freeing up bandwidth for other I/O.
Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Alek Pinchuk <apinchuk@datto.com>
MFC after: 2 weeks
Enabled driver can be used on boards equipped with Marvell Armada
3700/7k/8k SoCs.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12253
This driver will be used by Marvell Armada 3700 and 7k/8k SoC families.
The same, generic xhci device also appears in Armada 380, so we are reusing
the driver.
This patch also adds xhci_mv.c entry to the arm64 files list.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12252
Work around the problem that ifa_ifwithaddr() also matches the scope ID of
the IPv6 address when searching for a matching IPv6 address. For now
simply try all valid scope IDs until a match is found.
MFC after: 1 week
Sponsored by: Mellanox Technologies
use of the linux_poll_wakeup() function from unsafe contexts, which
can lead to use-after-free issues.
Instead of calling linux_poll_wakeup() directly use the wake_up()
family of functions in the LinuxKPI to do this.
Bump the FreeBSD version to force recompilation of external kernel modules.
MFC after: 1 week
Sponsored by: Mellanox Technologies
- start_wrq_wr must not drain the wr_list if there are incomplete_wrs
pending. This can happen when a t4_wrq_tx runs between two
start_wrq_wr.
- commit_wrq_wr must examine the cookie's pidx and ndesc with the
queue's lock held. Otherwise there is a bad race when incomplete WRs
are being completed and commit_wrq_wr for the WR that is ahead in the
queue updates the next incomplete WR's cookie's pidx/ndesc but the
commit_wrq_wr for the second one is using stale values that it read
without the lock.
MFC after: 1 week
Sponsored by: Chelsio Communications
P5040/P5021 have the same number of LAWs as P5020. There may be a better way of
getting the count from the FDT (fsl,num-laws property on soc/corenet-law or
soc/ecm-law), but that's not supported everywhere, so we still need this check
for those other cases.
These processors may not be supported yet, but add them for completion.
POWER9 is planned for support. e300 may work (based on 603e core).
P5040/P5021 are similar to P5020, so should work as well. One addition is
needed for P5040, to support the number of LAWs, and will be a separate commit.
In integrity mode, a larger logical sector (e.g., 4096 bytes) spans several
physical sectors (e.g., 512 bytes) on the backing device. Due to hash
overhead, a 4096 byte logical sector takes 8.5625 512-byte physical sectors.
This means that only 288 bytes (256 data + 32 hash) of the last 512 byte
sector are used.
The memory allocation used to store the encrypted data to be written to the
physical sectors comes from malloc(9) and does not use M_ZERO.
Previously, nothing initialized the final physical sector backing each
logical sector, aside from the hash + encrypted data portion. So 224 bytes
of kernel heap memory was leaked to every block :-(.
This patch addresses the issue by initializing the trailing portion of the
physical sector in every logical sector to zeros before use. A much simpler
but higher overhead fix would be to tag the entire allocation M_ZERO.
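The sector arithmetic above, and the effect of zeroing only the trailing portion, can be reproduced with a short sketch. It assumes, consistent with the figures quoted, that each 512-byte physical sector carries a 32-byte hash plus up to 480 bytes of data; the buffer and its 0xaa fill are only a simulation of a non-zeroed allocation.

```python
PHYS_SECTOR = 512
HASH_SIZE = 32
LOGICAL_SECTOR = 4096

data_per_phys = PHYS_SECTOR - HASH_SIZE                 # 480 data bytes
full_sectors, tail_data = divmod(LOGICAL_SECTOR, data_per_phys)
tail_used = HASH_SIZE + tail_data                       # 32 + 256 = 288
tail_unused = PHYS_SECTOR - tail_used                   # 224 bytes
phys_sectors = full_sectors + tail_used / PHYS_SECTOR   # 8.5625

# Simulate an allocation that was not zeroed, then clear only the unused
# tail of the last physical sector.
buf = bytearray(b"\xaa" * ((full_sectors + 1) * PHYS_SECTOR))
buf[-tail_unused:] = bytes(tail_unused)

print(full_sectors, tail_used, tail_unused, phys_sectors)
```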
PR: 222077
Reported by: Maxim Khitrov <max AT mxcrypt.com>
Reviewed by: emaste
Security: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12272
In particular this eliminates function calls and related register save/restore
when only a few writes would suffice.
Example speed up can be seen in a fstat microbenchmark on AMD Ryzen cpus, where
the throughput went up by ~4.5%.
Thanks to cem@ for benchmarking and reviewing the patch.
MFC after: 1 week
Scan all buses for CSR bus, not stopping on the first failed
match. Scan all slots for function 0 on the found bus, for instance on
IvyBridge the slot 0 is not decoded at all. Since the scan is quite
unsafe, and access to the buses is mostly useful for developers,
enable the csr buses scan with the tunable.
Current qpi.c makes too many assumptions about the uncore
configuration buses location and about slots occupied. Also it
restricts itself only to Nehalem CPUs. It is needed on all Core-based
Xeons. On the 2600 v2 (IvyBridge) machine I have access to, the CSR
buses have numbers 31 (BSP socket) and 63 (second socket), and there
are no functions pci0.31.0.0 or pci0.63.0.0. According to the CPU
datasheet, all devices on the uncore bus occupy slots >= 8.
Practically, the attach to config buses is required for the intel-pcm
pcm-memory.x tool to work, for instance.
Reviewed by: jhb (previous version)
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D12268
remapping.
VT-d specification requires use of PCI rid as source id for IOAPICs
enumerated by PCI bus. The values from the DMAR ACPI table should be
only used when IOAPIC is not on PCI.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Hardware provided by: Intel
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D12205
the interrupt messages from given IOAPIC, if the IOAPIC can be
enumerated on PCI bus.
If IOAPIC has PCI binding, match the PCI device against MADT
enumerated IOAPIC. Match is done first by registers window physical
address, then by IOAPIC ID as read from the APIC ID register.
PCI bsf address of the matched PCI device is the rid.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Hardware provided by: Intel
MFC after: 2 weeks
X-Differential revision: https://reviews.freebsd.org/D12205
This provides port stats (updated once per second) in
dev.bnxt.X.port_stats for PFs. VFs do not have access to the port stats.
Submitted by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed by: shurd, sbruno
Approved by: sbruno (mentor)
Sponsored by: Broadcom Limited
Differential Revision: https://reviews.freebsd.org/D11914
Do not use malloc(M_NOWAIT), wait is possible there, and the malloc
failures were not checked. Do not forget to free malloced memory.
Reported and tested by: pho
Approved by: sbruno
Sponsored by: The FreeBSD Foundation
We currently initialize the vm_page array in three passes: one to zero
the array, one to initialize the "order" field of each page (necessary
when inserting them into the vm_phys buddy allocator one-by-one), and
one to initialize the remaining non-zero fields and individually insert
each page into the allocator.
Merge the three passes into one following a suggestion from alc:
initialize vm_page fields in a single pass, and use vm_phys_free_contig()
to efficiently insert physical memory segments into the buddy allocator.
This reduces the initialization time to a third or a quarter of what it
was before on most systems that I tested.
Reviewed by: alc, kib
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D12248
17h supports MCA thresholding in the same way as 16h and earlier.
Supposedly a ScalableMca feature bit in CPUID 8000_0007:EBX must be set, but
that was not true for earlier models, so be careful about relying on it.
While here, document a missing bit in LS MCA MISC0.
Reviewed by: truckman
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12237
It doesn't seem necessary to busy the CPU while waiting to transition
into a different p-state.
PR: 221621 (related, but does not completely address)
Reviewed by: truckman
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12260
values. As not all assemblers understand the new ID_AA64MMFR2_EL1 register
add a macro to access it. It seems to be safe for older CPUs to read this
new register, with them returning zero.
Sponsored by: DARPA, AFRL
We need to extend the -Wno-format hack to yet another Makefile to cope
with %S meaning (CHAR16 *) not (wchar_t *) in the context of the EFI
boot loaders.
Sponsored by: Netflix
Rename boot1's wcslen to ucs2len, which we can't use in userland
because wchar in userland is unsigned, not short. Move it into
efichar.c. Also spell '* 2' as '* sizeof(efi_char)' and add 1 for the
trailing NUL to transition the FreeBSD boot env vars to being NUL
terminated on the same line...
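A rough Python equivalent of the length and size computation, assuming efi_char is a 16-bit little-endian code unit. The helper names other than ucs2len are hypothetical.

```python
import struct

EFI_CHAR_SIZE = 2  # stands in for sizeof(efi_char), assumed to be 2 bytes


def ucs2len(buf: bytes) -> int:
    """Count 16-bit code units before the terminating NUL."""
    n = 0
    while (n + 1) * EFI_CHAR_SIZE <= len(buf):
        (unit,) = struct.unpack_from("<H", buf, n * EFI_CHAR_SIZE)
        if unit == 0:
            return n
        n += 1
    raise ValueError("string is not NUL terminated")


def ucs2_size_with_nul(buf: bytes) -> int:
    """Bytes needed to store the string including its trailing NUL."""
    return (ucs2len(buf) + 1) * EFI_CHAR_SIZE


value = "kernel".encode("utf-16-le") + b"\x00\x00"
print(ucs2len(value), ucs2_size_with_nul(value))
```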
Sponsored by: Netflix
When bcopy is treated as memcpy/memmove, Clang produces warnings that the
size argument doesn't match the type of the source. This is true, it
doesn't match; we're aliasing the source.
Explicitly cast the source pointer to the expected type to remove the
warning.
No functional change.
Sponsored by: Dell EMC Isilon
and convert any messages of types SCM_BINTIME, SCM_TIMESTAMP,
SCM_REALTIME and SCM_MONOTONIC from 64-bit to its 32-bit
representation. Otherwise we either run out of user-supplied
buffer to copy those out resulting in the MSG_CTRUNC or simply
return values that the userland 32-bit code is not going
to parse correctly. This fixes at least two regression tests
failing to function properly in 32-bit compat mode:
tools/regression/sockets/udp_pingpong
tools/regression/sockets/unix_cmsg
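The size mismatch behind the MSG_CTRUNC symptom can be illustrated for a timestamp payload. The layouts below are assumptions made for the sketch (two 8-byte fields on the 64-bit side, two 4-byte fields on the 32-bit side), and timeval_to_32 is a hypothetical helper, not the kernel routine.

```python
import struct

TIMEVAL64 = struct.Struct("<qq")  # assumed 64-bit layout: tv_sec, tv_usec
TIMEVAL32 = struct.Struct("<ii")  # assumed 32-bit layout: tv_sec, tv_usec


def timeval_to_32(payload: bytes) -> bytes:
    """Repack a 64-bit timestamp payload into its 32-bit representation."""
    sec, usec = TIMEVAL64.unpack(payload)
    return TIMEVAL32.pack(sec, usec)


native = TIMEVAL64.pack(1504000000, 250000)
compat = timeval_to_32(native)

# 16 bytes would overflow a buffer that a 32-bit process sized for 8.
print(len(native), len(compat))
```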
PR: kern/222039
MFC after: 30 days
While __read_mostly groups variables together, their placement is not
specified. In particular 2 frequently used variables can end up in
different lines.
This annotation is only expected to be used for variables read all the time,
e.g. on each syscall entry.
MFC after: 1 week
While these locks are guaranteed to not share their respective cache lines,
their current placement leaves unnecessary holes in lines which preceded them.
For instance the annotation of vm_page_queue_free_mtx allows 2 neighbour
cachelines (previously separated by the lock) to be collapsed into 1.
The annotation is only effective on architectures which have it implemented in
their linker script (currently only amd64). Thus locks are not converted to
their not-padaligned variants as to not affect the rest.
MFC after: 1 week
The hsi_struct_def.h file contains all firmware (HWRM) data structs; update
it to the latest version, which was released on 30th Aug.
After this upgrade, HWRM version will be 1.8.1.5 (earlier it was 1.4.0).
Submitted by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed by: shurd, sbruno
Approved by: sbruno (mentor)
Sponsored by: Broadcom Limited
Differential Revision: https://reviews.freebsd.org/D12203
1) Based on the suggestion from the firmware team, derive
scctx->isc_ntxqsets_max & scctx->isc_nrxqsets_max from FUNC_QCFG
(instead of FUNC_QCAPS).
2) Bump the driver version to "1.0.0.2".
Submitted by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed by: shurd, sbruno
Approved by: sbruno (mentor)
Sponsored by: Broadcom Limited
Differential Revision: https://reviews.freebsd.org/D12128
In swp_pager_meta_build(), if the requested operation results in
freeing the last swap pointer in the swblk, free the trie node. Other
swap pager code does not expect to find completely empty swblk.
Reviewed by: alc, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
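The rule stated in the swp_pager_meta_build() change above (release the
container once its last pointer is gone) can be sketched as follows. The
types, constants and function names are hypothetical and much simpler
than the swap pager's swblk and trie code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NSLOTS		16		/* assumption: slots per block */
#define SLOT_NONE	UINT64_MAX	/* assumption: "no swap block" marker */

struct blk {
	uint64_t slot[NSLOTS];
};

/* Allocate a block with every slot marked empty; NULL on failure. */
static struct blk *
blk_alloc(void)
{
	struct blk *b = malloc(sizeof(*b));

	if (b != NULL)
		for (size_t i = 0; i < NSLOTS; i++)
			b->slot[i] = SLOT_NONE;
	return (b);
}

static bool
blk_is_empty(const struct blk *b)
{
	for (size_t i = 0; i < NSLOTS; i++)
		if (b->slot[i] != SLOT_NONE)
			return (false);
	return (true);
}

/*
 * Clear one slot. If that was the last used slot, free the block and
 * reset the caller's pointer so no completely empty block is left behind.
 */
static void
blk_clear_slot(struct blk **bp, size_t idx)
{
	struct blk *b = *bp;

	if (b == NULL || idx >= NSLOTS)
		return;
	b->slot[idx] = SLOT_NONE;
	if (blk_is_empty(b)) {
		free(b);
		*bp = NULL;
	}
}
```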
swapblk for our index while we dropped the object lock.
Noted by: jeff
Reviewed by: alc, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
report extended media types.
lacp_aggregator_bandwidth() uses the media to determine the speed of the
interface and returns 0 for IFM_OTHER without the bits in the extended
range.
Reported by: kbowling@
Reviewed by: eugen_grosbein.net, mjoras@
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D12188
transmit queues as well as non-ratelimited ones.
Add the required structure bits in order to support a backpressure
indication with ratelimited connections as well as non-ratelimited
ones. The backpressure indicator is a value between zero and 65535
inclusive, indicating if the destination transmit queue is empty or
full respectively. Applications can use this value as a decision point
for when to stop transmitting data to avoid endless ENOBUFS error
codes upon transmitting an mbuf. This indicator is also useful to
reduce the latency for ratelimited queues.
Reviewed by: gallatin, kib, gnn
Differential Revision: https://reviews.freebsd.org/D11518
Sponsored by: Mellanox Technologies
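One way an application might act on the backpressure indicator described
above is sketched here. How the value is obtained is not shown; the
helper names and the high-water threshold are assumptions, and only the
0..65535 range comes from the text.

```c
#include <stdbool.h>
#include <stdint.h>

#define BP_EMPTY	0u	/* destination transmit queue is empty */
#define BP_FULL		65535u	/* destination transmit queue is full */

/* Express the indicator as a whole-number fill percentage (0..100). */
static unsigned
bp_percent(uint16_t level)
{
	return (((unsigned)level * 100u) / BP_FULL);
}

/*
 * Hypothetical policy: stop sending once the indicator reaches a
 * caller-chosen high-water mark, instead of retrying on ENOBUFS.
 */
static bool
bp_should_pause(uint16_t level, uint16_t high_water)
{
	return (level >= high_water);
}
```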
Turn on the required options in the ERL config file, and ensure
that the fbt module is listed as a dependency for mips in
the modules/dtrace/dtraceall/dtraceall.c file.
PR: 220346
Reviewed by: gnn, markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D12227
Similar to r323195, but for amdsmn(4) driver (which borrowed some design).
Ignore hostbs that do not match our PCI device id criteria.
Sponsored by: Dell EMC Isilon
Some systems have hostbs that do not match our PCI device id criteria.
Detect and ignore these devices in probe.
PR: 218264
Sponsored by: Dell EMC Isilon
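The probe-time filtering described in the two hostb changes above could
be sketched like this. The ID table holds placeholder values, and
sketch_probe() is a hypothetical function; it does not use the real
newbus or PCI interfaces.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct pci_id {
	uint16_t vendor;
	uint16_t device;
};

/* Placeholder IDs for illustration; not taken from the driver. */
static const struct pci_id supported[] = {
	{ 0x1022, 0x1450 },
};

/*
 * Return 0 if the device matches an entry in the table, ENXIO otherwise,
 * so that non-matching host bridges are ignored rather than attached.
 */
static int
sketch_probe(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(supported) / sizeof(supported[0]); i++)
		if (supported[i].vendor == vendor &&
		    supported[i].device == device)
			return (0);
	return (ENXIO);
}
```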
"zfs mount -o" passes a list of mount options directly to nmount(2) after
sanity checking them. In particular, zfs(8) will refuse to mount an already
existing file system unless "remount" is specified in the option list.
However, the "remount" option only exists in Illumos. FreeBSD's equivalent is
"update".
PR: 221985
Reviewed by: avg
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D12233
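The log for the zfs mount change above does not say how the option
mismatch was resolved. One possible approach, shown purely as an
assumption, is to map the Illumos option name to its FreeBSD equivalent
before checking the option list; translate_mount_opt() is a hypothetical
helper.

```c
#include <string.h>

/*
 * Map "remount" to "update"; every other option is returned unchanged.
 * The returned pointer is either a string literal or the input itself.
 */
static const char *
translate_mount_opt(const char *opt)
{
	if (strcmp(opt, "remount") == 0)
		return ("update");
	return (opt);
}
```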
The sensor value is formatted similarly to previous models (same
bitfield sizes, same units), but must be read off of the internal
System Management Network (SMN) from the System Management Unit (SMU)
co-processor.
PR: 218264
Reported and tested by: Nils Beyer <nbe AT renzel.net>
Reviewed by: avg (no +1), mjoras, truckman
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12217
AMD Family 17h CPUs have an internal network used to communicate between
the host CPU and the PSP and SMU coprocessors. It exposes a simple
32-bit register space.
Reviewed by: avg (no +1), mjoras, truckman
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12217
It would be better to fix API consumers to not pass NULL there - most of them,
such as gmirror, already contain the necessary checks - but this is easier
and much less error-prone.
One known user-visible result is that it fixes panic on a failed "graid label".
PR: 221846
MFC after: 2 weeks
Sponsored by: DARPA, AFRL