While experimenting with changing boolean_t to another type, I noticed
that several powerpc pmap related functions returned the wrong type:
boolean_t instead of int.
Fix several declarations and definitions to match the actual pmap
function types: pmap_dev_direct_mapped_t and pmap_ts_referenced_t.
MFC after: 3 days
When vm_map_remove() is called from vm_swapout_map_deactivate_pages()
due to swapout, PKRU attributes for the removed range must be kept
intact. Provide a variant of pmap_remove(), pmap_map_delete(), to
allow pmap to distinguish between real removes of the UVA mappings
and any other internal removes, e.g. swapout.
For non-amd64, pmap_map_delete() is stubbed by define to pmap_remove().
Reported by: andrew
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D39556
Use them when possible, instead of separated flags.
No functional change intended.
Reviewed by: hselasky, erj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39466
NETLINK is going to replace rtsock and a number of other ioctl/sysctl interfaces.
In-base utilies such as route(8), netstat(8) and soon ifconfig(8)
are being converted to use netlink sockets as a transport between
kernel and userland.
In the current configuration, it still possible have the kernel
without NETLINK (`nooptions NETLINK`) and use the aforementioned
utilies by buidling the world with `WITHOUT_NETLINK` src.conf knob.
However, this approach does not cover the cases when person unintentionally
builds a custom kernel without netlink and tries to use the standard userland.
This change adds `option NETLINK` to the default options for each
architecture, fixing the custom kernel issue.
For arm, this change uses `std.armv6` and `std.armv7` (netlink already in)
instead of DEFAULTS.
Reviewed By: imp
Differential Revision: https://reviews.freebsd.org/D39339
It's apparently possible for pcpu->pc_curpmap to be NULL at some point,
leading to a panic. Account for this as is done with the other 64-bit
AIM pmap.
Reported by: pkubaj
Tested by: pkubaj
Fixes: 6f0b2a235a ("Add pmap_sync_icache() for radix pmap")
MFC after: 3 days
Make a pass at the various nexus implementations, fixing some very minor
style issues, obsolete comments, etc.
Update the top-level comment to be closer to other nexus
implementations.
The method declaration section has become unwieldy in many respects.
Attempt to tame it by:
- Using generated method typedefs
- Grouping methods roughly by category, and then alphabetically.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D38495
DTrace pid provider writes to user space to set breakpoints. Failing to
sync the icache can lead to SIGTRAP. Radix pmap is the only one missing
a pmap_sync_icache() method, so the pid provider would only potentially
crash a process on a POWER9 or later system.
Summary:
This review ports mlx5 driver, kernel's OFED stack (userland is already enabled), KTLS and krping to powerpc64 and powerpc64le.
krping requires a small change since it uses assembly for amd64 / i386.
NOTE: On powerpc64le RDMA works fine in the userspace with libmlx5, but on powerpc64 it does not. The problem is that contrib/ofed/libmlx5/doorbell.h checks for SIZEOF_LONG but this macro exists on neither powerpc64* nor amd64. Thus, the file silently goes to the fallback function written for 32-bit architectures. It works fine on little-endian architectures, but causes a hard fail on big-endian. It's possible it may also cause some runtime issues on little-endian.
Thus, on powerpc64 I verified that RDMA works with krping.
Reviewers: #powerpc, hselasky
Subscribers: bdrewery, imp, emaste, jhibbits
Differential Revision: https://reviews.freebsd.org/D38786
Summary:
This review ports mlx5 driver, kernel's OFED stack (userland is already enabled), KTLS and krping to powerpc64 and powerpc64le.
krping requires a small change since it uses assembly for amd64 / i386.
NOTE: On powerpc64le RDMA works fine in the userspace with libmlx5, but on powerpc64 it does not. The problem is that contrib/ofed/libmlx5/doorbell.h checks for SIZEOF_LONG but this macro exists on neither powerpc64* nor amd64. Thus, the file silently goes to the fallback function written for 32-bit architectures. It works fine on little-endian architectures, but causes a hard fail on big-endian. It's possible it may also cause some runtime issues on little-endian.
Thus, on powerpc64 I verified that RDMA works with krping.
Reviewers: #powerpc, hselasky
Subscribers: bdrewery, imp, emaste, jhibbits
Differential Revision: https://reviews.freebsd.org/D38786
Most options in kernel config files use "options<space><tab>OPTION".
This allows the option to be commented out without shifting columns.
A few options had two tabs, and some had spaces. Make them consistent.
Very old versions of gcc defined _BIG_ENDIAN and _LITTLE_ENDIAN. So to
work around that, we undefined them here. However, that causes problems
for programs that do:
(and many other variations on that theme). Since this often is the
result of weirdly nested includes in the ports world that are hard to
unwind, drop this workaround to help more ports build out of the box.
If there's still an issue here (and my testing hasn't shown it), we'll
fix the issue in a brand-new way once I have a reproducer.
This fixes the mesa-devel build, and others
Sponsored by: Netflix
Tested by: pkubaj
MFC After: 3 days
Differential Revision: https://reviews.freebsd.org/D38564
Rather than using the DEVICE_IDENTIFY method, let's have other
ofwbus-using platforms add ofwbus0 explicitly in nexus, like arm64. This
gives them the same flexibility, e.g. if riscv starts supporting ACPI,
and cleans up the #ifdefs.
We were doing this already on riscv, but adjust the 'order' parameters.
Reviewed by: andrew, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D38492
The architecture nexus should handle allocation and release of memory and
interrupts. This is to ensure that system-wide resources such as these
are available to all devices, not just children of ofwbus0.
On powerpc this moves the ownership of these resources up one level,
from ofwbus0 to nexus0. Other architectures already have the required
logic in their nexus implementation, so this eliminates the duplication
of resources. An implementation of nexus_adjust_resource() is added for
arm, arm64, and riscv.
As noted by ian@ in the review, resource handling was the main bit of
logic distinguishing ofwbus from simplebus. With some attention to
detail, it should be possible to merge the two in the future.
Co-authored by: mhorne
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D30554
This is a followup of 692e19cf51 (add netlink to GENERIC@amd64).
Netlink is a communication protocol defined in RFC 3549. It is async,
TLV-based protocol, providing 1-1 and 1-many communications between kernel
and userland. Netlink is currently used in Linux kernel to modify, read and
subscribe for nearly all networking states. Interface state, addresses, routes,
firewall, rules, fibs, etc, are controlled via Netlink.
Netlink support was added in D36002. It has got a number of improvements and
first customers since then:
* net/bird2 got netlink support, enabling route multipath in FreeBSD
* netlink-based devd notifications are being worked on ( D37574 ).
* linux(4) fully supports and depends on Netlink
Enabling Netlink in GENERIC targets two goals.
The first one is to provide stability for the third-party userland applications,
so they can rely on the fact that netlink always exists since 14.0 and potentially 13.2.
Loadable module makes life of the app delepers harder. For example, `net/bird2` can be
either build with netlink or rtsock support, but not both.
The second goal is to enable gradual conversion of the base userland tools
to use netlink(4) interfaces. Converting tools like netstat (D36529), route,
ifconfig one-by-one simplifies testing and addressing the feedback.
Othewise, switching all base to use netlink at once may be too big of a leap.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D37783
Remove the platform-specific definitions of VM_BATCHQUEUE_SIZE
for amd64 and powerpc64, and instead treat all 64-bit platforms
identically. This has the effect of increasing the arm64
and riscv VM_BATCHQUEUE_SIZE to match that of other platforms.
Reviewed by: jhb, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37707
Rather than waiting until the batchqueue is full to acquire the lock &
process the queue, we now start trying to acquire the lock using trylocks
when the batchqueue is 1/2 full. This removes almost all contention on the
vm pagequeue mutex for for our busy sendfile() based web workload.
It also greadly reduces the amount of time a network driver ithread
remains blocked on a mutex, and eliminates some packet drops under
heavy load.
So that the system does not loose the benefit of processing large
batchqueues, I've doubled the size of the batchqueues. This way, when
there is no contention, we process the same batch size as before.
This has been run for several months on a busy Netflix server, as well
as on my personal desktop.
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37305
Flags should be M_WAITOK | M_ZERO instead of just M_ZERO
Reviewed by: markj
MFC after: 2 days
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D36865
This matches the return type of pmap_mapdev/bios.
Reviewed by: kib, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36548
This changes the default TCP Congestion Control (CC) to CUBIC.
For small, transactional exchanges (e.g. web objects <15kB), this
will not have a material effect. However, for long duration data
transfers, CUBIC allocates a slightly higher fraction of the
available bandwidth, when competing against NewReno CC.
Reviewed By: tuexen, mav, #transport, guest-ccui, emaste
Relnotes: Yes
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D36537
This applies one of the changes from
5567d6b441 to other architectures
besides arm64.
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36263
With clang 15, the following -Werror warnings are produced:
sys/powerpc/booke/mp_cpudep.c:54:20: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
cpudep_ap_bootstrap()
^
void
sys/powerpc/booke/mp_cpudep.c:97:16: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
cpudep_ap_setup()
^
void
This is because cpudep_ap_bootstrap() and cpudep_ap_setup() are declared
with (void) argument lists, but defined with empty argument lists. Make
the definitions match the declarations.
MFC after: 3 days
With clang 15, the following -Werror warning is produced:
sys/powerpc/aim/moea64_native.c:306:22: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
moea64_install_native()
^
void
This is because moea64_install_native() is declared with a (void)
argument list, but defined with an empty argument list. Make the
definition match the declaration.
MFC after: 3 days
With clang 15, the following -Werror warning is produced:
sys/powerpc/aim/mmu_radix.c:5409:22: error: variable 'freed' set but not used [-Werror,-Wunused-but-set-variable]
int allfree, field, freed, idx;
^
The 'freed' variable is only used when PV_STATS is defined. Ensure it is
only declared and set in that case.
MFC after: 3 days
With clang 15, the following -Werror warnings are produced:
sys/powerpc/aim/mmu_radix.c:786:20: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
mmu_radix_tlbie_all()
^
void
sys/powerpc/aim/mmu_radix.c:3615:15: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
mmu_radix_init()
^
void
sys/powerpc/aim/mmu_radix.c:6081:20: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
mmu_radix_scan_init()
^
void
This is because mmu_radix_tlbie_all(), mmu_radix_init(), and
mmu_radix_scan_init() are declared with (void) argument lists, but
defined with empty argument lists. Make the definitions match the
declarations.
MFC after: 3 days
With clang 15, the following -Werror warnings are produced:
sys/powerpc/aim/mmu_oea64.c:1947:12: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
moea64_init()
^
void
sys/powerpc/aim/mmu_oea64.c:3257:17: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
moea64_scan_init()
^
void
This is because moea64_init() and moea64_scan_init() are declared with
(void) argument lists, but defined with empty argument lists. Make the
definitions match the declarations.
MFC after: 3 days
With clang 15, the following -Werror warning is produced:
sys/powerpc/aim/aim_machdep.c:750:14: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
mpc745x_sleep()
^
void
This is because mpc745x_sleep() is declared with a (void) argument list,
but defined with an empty argument list. Make the definition match the
declaration.
MFC after: 3 days
Since the likelihood of new Book-E PowerPC SoCs being produced is near
zero clamp MAXCPU to around the highest number of cores/threads
available currently, for both 64-bit and 32-bit. Even though the
current highest core/thread count is 24, the cap is set at 32 in case
there is code that assumes power of 2.
The CAM 'maxio' is a 'pessimized' size, assuming 4k pages and one page
per segment. Since there are at most 63 segments in a transaction with
this driver, and one would necessarily be the indirect segment marker,
clamp the maxio to the minimum of maxphys (tunable) or (63 - 1) pages
(248k).
MFC after: 3 days
Make powerpspe kernel config in sync with other targets making
GEOM_LABEL built-in to allow use of labels when mounting partitions.
MFC after: 2 days
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Make most AST handlers dynamically registered. This allows to have
subsystem-specific handler source located in the subsystem files,
instead of making subr_trap.c aware of it. For instance, signal
delivery code on return to userspace is now moved to kern_sig.c.
Also, it allows to have some handlers designated as the cleanup (kclear)
type, which are called both at AST and on thread/process exit. For
instance, ast(), exit1(), and NFS server no longer need to be aware
about UFS softdep processing.
The dynamic registration also allows third-party modules to register AST
handlers if needed. There is one caveat with loadable modules: the
code does not make any effort to ensure that the module is not unloaded
before all threads processed through AST handler in it. In fact, this
is already present behavior for hwpmc.ko and ufs.ko. I do not think it
is worth the efforts and the runtime overhead to try to fix it.
Reviewed by: markj
Tested by: emaste (arm64), pho
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35888
Sometimes we need to reset a PCIe bus, but sometimes it breaks the
downstream device(s). Since, from my testing, this is only needed for
Radeon cards installed in the AmigaOne machines because the card was
already initialized by firmware, make the reset dependent on a device
hint (hint.pcib.X.reset=1). With this, AmigaOne X5000 machines can have
other devices in the secondary PCIe slots.
The devmap variants used vm_offset_t for some reason, and a few places
explicitly cast bus addresses to vm_offset_t. (Probably those casts
along with similar casts for vm_size_t should just be removed and
instead permit the compiler to DTRT.)
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D35961
Store the shared page address in struct vmspace.
Also instead of storing absolute addresses of various shared page
segments save their offsets with respect to the shared page address.
This will be more useful when the shared page address is randomized.
Approved by: mw(mentor)
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D35393
Use a getter macro instead of fetching the sigcode address directly
from a sysent of a given process. It assumes that the sigcode is stored
in the shared page, which is true in all cases, except for a.out
binaries. This will be later useful when the shared page address
randomization is introduced.
No functional change intended.
Approved by: mw(mentor)
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D35392
Stop including the current CPU in all event messages, since it's already
saved in KTR log entries and thus is redundant. All eventtimer traces
occur in a context where CPU migration is not possible.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Partially revert the previous change; we need to keep this method as a
specific override for pci_driver subclasses which should not use
pci_rescan_method() -- cardbus and ofw_pcibus. However, change the return
value to ENODEV for the same reasoning given in the original commit, and
use this as the default rescan method in bus_if.m.
Reported by: jhb
Fixes: 36a8572ee8 ("bus_if: provide a default null rescan method")
MFC with: 36a8572ee8
The third argument to this function indicates whether the supplied
ticker is fixed or variable, i.e. requiring calibration. Give this
argument a type and name that better conveys this purpose.
Reviewed by: kib, markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35459
There is an existing helper method in subr_bus.c, but almost no drivers
know to use it. It also returns the same error as an empty method,
making it not very useful. Move this to bus_if.m and return a more
sensible error code.
This gives a slightly more meaningful error message when attempting
'devctl rescan' on buses and devices alike:
"Device not configured" --> "Operation not supported by device"
Reviewed by: imp
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35501