From Jake:
Vendor drivers that exist out-of-tree generally should return
BUS_PROBE_VENDOR from their device probe functions. This helps ensure
that a vendor replacement driver will supersede the in-kernel driver for
a given device.
Currently, if a vendor wants to implement a driver based on iflib, it
will always report BUS_PROBE_DEFAULT.
Add a wrapper function, iflib_device_probe_vendor() which can be used in
place of iflib_device_probe(). This function will just return
BUS_PROBE_VENDOR whenever iflib_device_probe() would return
BUS_PROBE_DEFAULT.
While vendor drivers can already implement such a wrapper themselves,
providing it in the iflib.h header makes it easier for the vendor driver
to do the right thing.
Submitted by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed by: erj@, gallatin@, marius@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D20221
Due to an attempt to check two conditions at once in a macro not designed
as such, the assertion would always evaluate to true.
#define VERIFY3_IMPL(LEFT, OP, RIGHT, TYPE) do { \
const TYPE __left = (TYPE)(LEFT); \
const TYPE __right = (TYPE)(RIGHT); \
if (!(__left OP __right)) \
assfail3(#LEFT " " #OP " " #RIGHT, \
(uintmax_t)__left, #OP, (uintmax_t)__right, \
__FILE__, __LINE__); \
_NOTE(CONSTCOND) } while (0)
#define ASSERT3U(x, y, z) VERIFY3_IMPL(x, y, z, uint64_t)
Mean that we compared:
left = (type == ZIO_TYPE_FREE || psize)
OP = "<="
right = (SPA_MAXBLOCKSIZE)
If the type was not FREE, 0 is less than SPA_MAXBLOCKSIZE (16MB)
If the type is ZIO_TYPE_FREE, 1 is less than SPA_MAXBLOCKSIZE
The constraint on psize (physical size of the FREE operation) is never
checked against SPA_MAXBLOCKSIZE
Reported by: Ka Ho Ng <khng300@gmail.com>
Reviewed by: kevans
MFC after: 2 weeks
Sponsored by: Klara Systems
the code is not guarded by the if clause and has misleading indentation.
Approved by: scottl
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20427
PT_ATTACH was consumed.
In particular, do not clear TDP_FSTP in ptracestop() if td_wchan is
non-NULL. Leave it to sleepq_catch_signal() to clear and convert zero
return code to EINTR.
Otherwise, per submitter report, if the PT_ATTACH SIGSTOP was
delivered right after the thread was added to the sleepqueue but not
yet really sleep, and cursig() caused debugger attach, the thread
sleeps instead of returning to the userspace boundary with EINTR.
PR: 231445
Reported by: Efi Weiss <valmarelox@gmail.com>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20381
Since drm2 removal, there has not been any consumer of the feature in the
tree. I am also unaware of any out-of-tree consumer.
More importantly, the feature has been broken from the very start, both
before and after r306589, because the ivar was set on a device that does
not support it and it was read from another device that also does not
support it.
A bus-wide no-stop flag cannot be implemented as an ivar as iicbus
attaches as a child of various drivers. Implementing the ivar in each
and every I2C driver is just impractical.
If we ever want to implement this feature properly, then probably the
easiest way to do it would be via a flag in the softc of iicbus.
In fact, we might have to do that in the stable branches if we want to
fix the code for them.
Reported by: ian (long time ago)
MFC after: 1 month (maybe)
X-MFC-note: cannot just merge the change, must keep drm2 happy
Summary:
Due to missing relocation support in libdwarf for powerpc64, handling of dwarf
info on unlinked objects was bogus.
Examining raw dwarf data on objects compiled on ppc64 with a modern compiler
(in-tree gcc tends to hide the issue, since it only rarely generates relocations
in .debug_info and uses DW_FORM_str instead of DW_FORM_strp for everything), you
will find that the dwarf data appears corrupt, with repeated references to the
compiler version where things like types and function names should appear.
This happens because the 0 offset of .debug_str contains the compiler version,
and without applying the relocations, *all* indirect strings in .dwarf_info will
end up pointing to it.
This corruption then propogates to the CTF data, as ctfconvert relies on
libdwarf to read the dwarf info, for every compiled object (when building a
kernel.)
However, if you examine the dwarf data on a compiled executable, it will appear
correct, because during final link the relocations get applied and baked in by
the linker.
Submitted by: Brandon Bergren
Reviewed By: emaste
Differential Revision: https://reviews.freebsd.org/D20367
There were two remaining "gaps" in auditing local bridge traffic with
bpf(4):
Locally originated outbound traffic from a member interface is invisible to
the bridge's bpf(4) interface. Inbound traffic locally destined to a member
interface is invisible to the member's bpf(4) interface -- this traffic has
no chance after bridge_input to otherwise pass it over, and it wasn't
originally received on this interface.
I call these "gaps" because they don't affect conventional bridge setups.
Alas, being able to establish an audit trail of all locally destined traffic
for setups that can function like this is useful in some scenarios.
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19757
Users of pseudofs (e.g. lindebugfs), should be able to receive
input from command line via commands like "echo 1 > /path/to/file".
Currently this fails because sh tries to truncate the file first and
vop_setattr returns not supported error for this. This patch simply
ignores the error and returns 0 instead.
Reviewed by: imp (mentor), asomers
Approved by: imp (mentor), asomers
MFC after: 1 week
Differential Revision: D20451
As I see, different NICs in different configurations may have different
numbers of TX and RX queues. The code was assuming 1:1 mapping between
event queues (interrupts) and TX/RX queues. Since number of interrupts
is set to maximum of TX and RX queues, when those two are different, the
system is doomed.
I have no documentation or deep knowledge about this hardware, so this
change is based on general observations and code reading. If some of my
guesses are wrong, please do better. I just confirmed HP NC550SFP NICs
are working now.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
CID 1400451: case 0 is missing a break/return and falling through to the
default case. waitpid(0, ...) makes little sense in the child, we likely
wanted to terminate immediately.
CID 1400453: size argument uses sizeof(char **) instead of sizeof(char *)
and is assigned to a char **; sizeof's match but "this isn't a portable
assumption".
CID: 1400451, 1400453
MFC after: 3 days
cpufunc, in terms of __builtin_ffs and the like, for arm32 v6 and v7
architectures, and use those, rather than the simple libkern
implementations, in building arm32 kernels.
Reviewed by: manu
Approved by: kib, markj (mentors)
Tested by: iz-rpi03_hs-karlsruhe.de, mikael.urankar_gmail.com, ian
Differential Revision: https://reviews.freebsd.org/D20412
It appeared that using NET_EPOCH_WAIT() while holding global BPF lock
can lead to another panic:
spin lock 0xfffff800183c9840 (turnstile lock) held by 0xfffff80018e2c5a0 (tid 100325) too long
panic: spin lock held too long
...
#0 sched_switch (td=0xfffff80018e2c5a0, newtd=0xfffff8000389e000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2133
#1 0xffffffff80bf9912 in mi_switch (flags=256, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:439
#2 0xffffffff80c21db7 in sched_bind (td=<optimized out>, cpu=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2704
#3 0xffffffff80c34c33 in epoch_block_handler_preempt (global=<optimized out>, cr=0xfffffe00005a1a00, arg=<optimized out>)
at /usr/src/sys/kern/subr_epoch.c:394
#4 0xffffffff803c741b in epoch_block (global=<optimized out>, cr=<optimized out>, cb=<optimized out>, ct=<optimized out>)
at /usr/src/sys/contrib/ck/src/ck_epoch.c:416
#5 ck_epoch_synchronize_wait (global=0xfffff8000380cd80, cb=<optimized out>, ct=<optimized out>) at /usr/src/sys/contrib/ck/src/ck_epoch.c:465
#6 0xffffffff80c3475e in epoch_wait_preempt (epoch=0xfffff8000380cd80) at /usr/src/sys/kern/subr_epoch.c:513
#7 0xffffffff80ce970b in bpf_detachd_locked (d=0xfffff801d309cc00, detached_ifp=<optimized out>) at /usr/src/sys/net/bpf.c:856
#8 0xffffffff80ced166 in bpf_detachd (d=<optimized out>) at /usr/src/sys/net/bpf.c:836
#9 bpf_dtor (data=0xfffff801d309cc00) at /usr/src/sys/net/bpf.c:914
To fix this add the check to the catchpacket() that BPF descriptor was
not detached just before we acquired BPFD_LOCK().
Reported by: slavash
Tested by: slavash
MFC after: 1 week
KUBSAN was complaining the pointer contigmalloc_domainset returned was
misaligned. Fix this by using the correct argument to find the alignment
in the function signature.
Reported by: KUBSAN
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
This checks the alignment of a given pointer is sufficient for the
requested alignment asked for. This fixes the build with a recent
llvm/clang.
Sponsored by: DARPA, AFRL
* F_RDLCK, F_UNLCK, and F_WRLCK aren't flags. They're stored in the
fl.l_type field.
* Add F_REMOTE, added in r177633
* Add F_NOINTR, added in r180025
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
MDT_MODULE info is required to be ordered before any other MDT metadata for
a given kld because it serves as an implicit record boundary between
distinct klds for linker.hints consumers. kldxref(8) has previously relied
on the assumption that MDT_MODULE was ordered relative to other module
metadata in kld objects by source code ordering.
However, C does not require implementations to emit file scope objects in
any particular order, and it seems that GCC 6.4.0 and/or binutils 2.32 ld
may reorder emitted objects with respect to source code ordering.
So: just take two passes over a given .ko's module metadata, scanning for
the MDT_MODULE on the first pass and the other metadata on subsequent
passes. It's not super expensive and not exactly a performance-critical
piece of code. This ensures MDT_MODULE is always ordered before
MDT_PNP_INFO and other MDTs, regardless of compiler/linker movement. As a
fringe benefit, it removes the requirement that care be taken to always
order MODULE_PNP_INFO after DRIVER_MODULE in source code.
Reviewed by: emaste, imp
Differential Revision: https://reviews.freebsd.org/D20405
tables which affect demotion.
The last last-level page table under 2M mappings below KERNend was
only partially initialized. When that page was used as the hardware
page table for demotion of the 2M mapping, the result was not
consistent. Since pmap_demote_pde() is switched to use PG_PROMOTED as
the test for the validity of the saved last level page table page, we
can keep page table pages zero-initialized instead. Demotion would
fill them as needed.
Only map the created page tables beyond KERNend, there is no need to
pre-promote PTmap after KERNend, because the extra mapping is not used.
Only round up *firstaddr to 2M boundary when it is below rounded
KERNend. Sometimes the allocpages() calls advance *firstaddr past the
end of the last 2MB page mapping. In that case, this conditional
avoids wasting an average of 1MB of physical memory.
Update comments to explain action in more clean and direct language.
Reported and tested by: pho
In collaboration with: alc
Sponsored by: The FreeBSD Foundation (kib)
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D20380
bpf_mtap() can invoke catchpacket() for already detached descriptor.
And this can lead to NULL pointer dereference, since bd_bif pointer
was reset to NULL in bpf_detachd_locked(). To avoid this, use
NET_EPOCH_WAIT() when descriptor is removed from interface's descriptors
list. After the wait it is safe to modify descriptor's content.
Submitted by: kib
Reported by: slavash
MFC after: 1 week
Check the CTF magic number in big endian platforms. This lets DTrace FBT
handle types correctly on these platforms.
Submitted by: Brandon Bergren
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20413
'.' function names exist only in ELFv1. ELFv2 does away with function
descriptors, and look more like they do on powerpc(32) and most other
platforms, as direct function pointers. Stop blacklisting regular function
names in ELFv2.
Submitted by: Brandon Bergren
Differential Revision: https://reviews.freebsd.org/D20346
acpi_config_intr() will be called when an arm64 system booted with ACPI.
We do the interrupt mapping for ACPI interrupts in nexus_acpi_map_intr()
on arm64, so acpi_config_intr() has to just return success without
printing this error message.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D19432
received and support for time stamps was negotiated in the SYN/SYNACK
exchange, perform the PAWS check and only expand the syn cache entry if
the check is passed.
Without this check, endpoints may get stuck on the incomplete queue.
Reviewed by: jtl@
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D20374