We drop the keg lock when we go to actually allocate the slab, allowing
other threads to advance the cursor. This can cause us to exit the
round-robin loop before having attempted allocations from all domains,
resulting in a hang during a subsequent blocking allocation attempt from
a depleted domain.
Reported and tested by: Jan Bramkamp <crest@bultmann.eu>
Reviewed by: alc, cem
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17209
The amd64 kernel started using ifunc for a variety of functions with
arch-specific implementations, and we would like to make use of the
same functionality on i386 and as much as possible avoid divergence
between i386 and amd64. In particular, future changes for security
improvements and mitigations may rely on ifunc support.
Approved by: re (kib)
Sponsored by: The FreeBSD Foundation
Limits can be safely obtained with lim_cur from the thread. racct is compiled
in but disabled by default. Note that racct enablement is a boot-only tunable.
This eliminates second most common place of taking the lock while pkg building.
While here don't take the lock in mlockall either.
Reviewed by: kib
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17210
State check before enqueuing transmit task in bxe_link_attn() routine.
State check before invoking bxe_nic_unload in bxe_shutdown().
Submitted by:Vaishali.Kulkarni@cavium.com
Approved by:re(gjb)
Doing so can deadlock when the thread already owns another vnode lock,
e.g. during a rename, as was demonstrated by the reporter. In fact,
there seems to be no need to force the call to getinoquota() always,
because vn_open() locks vnode exclusively, and this is the most
important case. To add to the point, directories where the dirent is
added or removed, are locked exclusively as well.
Reported by: bwidawsk
Tested by: bwidawsk, pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
MFC after: 1 week
The atpic_register_sources callback tries to avoid registering interrupt
sources that would collide with an I/O APIC. However, the previous
implementation was failing to register IRQs 8-15 since the slave PIC
saw valid IRQs from the master and assumed an I/O APIC was present. To
fix, go back to registering all 8259A interrupt sources in one loop when
the master's register_sources method is invoked.
PR: 231291
Approved by: re (kib)
MFC after: 1 month
Also change the behaviour slightly: instead of freeing "config" if the
last nvlist doesn't pass the tests, return the last config that did pass
those tests. This matches the comment at the beginning of the function.
PR: 230704
Diagnosed by: avg
Reviewed by: asomers, avg
Tested by: Mark Martinec <Mark.Martinec@ijs.si>
Approved by: re (gjb)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D17202
Patch removes all checks for pti/pcid/invpcid from the context switch
path. I verified this by looking at the generated code, compiling with
the in-tree clang. The invpcid_works1 trick required inline attribute
for pmap_activate_sw_pcid_pti() to work.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
Differential revision: https://reviews.freebsd.org/D17181
There is no need to use %rax for temporary values and avoiding doing
so shortens the func.
Handle the explicit 'check for tail' depessimisization for backwards copying.
This reduces the diff against userspace.
Tested with the glibc test suite.
Approved by: re (kib)
This will be used in following conversion of pmap_activate_sw().
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
Differential revision: https://reviews.freebsd.org/D17181
There is a braino in the non-erms variant which breaks the
functionality.
Will be fixed at a later time with a different patch.
Reported by: Manfred Antar
Approved by: re (implicit)
There is no need to use %rax for temporary values and avoiding doing
so shortens the func.
Handle the explicit 'check for tail' depessimisization for backwards copying.
This reduces the diff against userspace.
Approved by: re (kib)
when there is work to do. This reduces CPU consumption to one
third on systems. This will help keep the thread CPU usage under
control now that the default hash size has increased.
Reviewed by: kp
Approved by: re (kib)
Differential Revision: https://reviews.freebsd.org/D17097
Intel docs claim such a memset (rep stosb + 4096 bytes) is
special-cased by microarchs. They also switched Linux to use
it for this purpose.
Approved by: re (gjb)
Not all event descriptions have a sample rate (such as inst_retired.any)
this will restore the legacy behavior of using 65536 in that case. It also
prevents accidental API misuse that could lead to panic.
PR: 230985
Reported by: markj
Reviewed by: markj
Approved by: re (gjb)
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D16958
We ought to be consistent across our Tier-1 and nearly-Tier-1
architectures, so enable Capsicum for 32-bit armv6/armv7 by default.
PR: 204008
Reviewed by: ian, oshogbo
Approved by: re (gjb)
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17023
The previous default of "balanced" appears to have caused pathological
behavior, including very poor performance and 100% CPU load in the
arc_reclaim_thread.
The symptoms appeared when the daily periodic run started.
With this change, the system--and the ARC in particular--behaved
normally during a manual daily periodic run.
From Mark Johnston: The port of the balanced strategy is incomplete,
since arc_prune_async() is a no-op on FreeBSD. (This also seems
to imply that r337653 is a no-op.) After 12 is branched we can
port the remaining bits and consider changing the default back.
Submitted by: markj (essentially)
Reviewed by: markj
Approved by: re (gjb)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D17156
r338642 toggled the REPRODUCIBLE_BUILD knob but missed the
corresponding kern.opts.mk change.
We want to build the 12.0 release artifacts with reproducible builds
mode enabled. Switch it on in HEAD now to enable testing with upcoming
ALPHA builds. We can revisit the default setting for HEAD after the
branch is created.
This change eliminates the build metadata (user, hostname, timestamp,
etc.) from the kernel and loader. If the src tree is a git, svn or p4
checkout with changes then the metadata is retained.
The WITHOUT_REPRODUCIBLE_BUILD src.conf(5) knob can be used to revert
to the previous behaviour.
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Both drivers use this interface so add a dependancy on it.
Since awg uses aw_sid for generating the MAC address, make it
depend on both aw_sid and nmvem so when only removing nvmem from
kernel config it will not include this driver.
Reported by: sbruno
Approved by: re (gjb)
The Xen page-table walker used to resolve the virtual addresses in the
hypercalls will refuse to access user-space pages when SMAP is enabled
unless the AC flag in EFLAGS is set (just like normal hardware with
SMAP support would do).
Since privcmd allows forwarding hypercalls (and buffers) from
user-space into Xen make sure SMAP is temporary disabled for the
duration of the hypercall from user-space.
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
Register interrupts using the PIC pic_register_sources method instead
of doing it in apic_setup_io. This is now required, since the internal
interrupt structures are not yet setup when calling apic_setup_io.
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
Instead of panicking. Legacy PVH mode doesn't provide a lapic, and
since native_lapic_intrcnt is called unconditionally this would cause
the assert to trigger. Change the assert into a continue in order to
take into account the possibility of systems without a lapic.
Reviewed by: jhb
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D17015
The recommended way to obtain the vcpu id is using the cpuid
instruction with a specific leaf value. This leaf value must be
obtained at runtime, and it's done when populating the hypercall page.
Legacy PVH however will get the hypercall page populated by the
hypervisor itself before booting, so the cpuid leaf was not actually
set, thus preventing setting the vcpu id value from cpuid.
Fix this by making sure the cpuid leaf has been probed before
attempting to set the vcpu id.
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
That's the only mode in FreeBSD that requires the usage of PIRQs, so
there's no need to attach the PIRQ PIC when running in other modes.
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
When adding support for the new PVH mode the kenv handling was
switched to use a boot time allocated scratch space, however the
legacy PVH early boot code was not modified to allocate such space.
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
The vcpu_id for legacy PVH mode can be set from the output of cpuid,
so there's no need to have a special function to set it.
Also note that xenpv_set_ids should have been executed only for PV
guests, but was executed for all guests types and vcpu_id was later
fixed up for HVM guests.
Reported by: cperciva
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
So that it's done when the vcpu_id has been set. For the BSP the
vcpu_id is set at SUB_INTR, while for the APs it's done in
init_secondary_tail that's called at SUB_SMP order FIRST.
Reported and tested by: cperciva
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D17013
When running as a specific type of Xen guest the hypervisor won't
provide any emulated IO-APICs or legacy PICs at all, thus hitting the
following assert in the MSI code:
panic: Assertion num_io_irqs > 0 failed at /usr/src/sys/x86/x86/msi.c:334
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff826ffa70
vpanic() at vpanic+0x1a3/frame 0xffffffff826ffad0
panic() at panic+0x43/frame 0xffffffff826ffb30
msi_init() at msi_init+0xed/frame 0xffffffff826ffb40
apic_setup_io() at apic_setup_io+0x72/frame 0xffffffff826ffb50
mi_startup() at mi_startup+0x118/frame 0xffffffff826ffb70
start_kernel() at start_kernel+0x10
Fix this by removing the assert in the MSI code, since it's possible
to get to the MSI initialization without having registered any other
interrupt sources.
Reviewed by: jhb
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D17001
Or else it triggers the following bug:
APIC: CPU 6 has ACPI ID 6
APIC: CPU 7 has ACPI ID 7
panic: vm_wait in early boot
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff826ff8d0
vpanic() at vpanic+0x1a3/frame 0xffffffff826ff930
panic() at panic+0x43/frame 0xffffffff826ff990
vm_wait_domain() at vm_wait_domain+0xf9/frame 0xffffffff826ff9c0
kmem_alloc_contig_domain() at kmem_alloc_contig_domain+0x252/frame 0xffffffff826ffa50
kmem_alloc_contig() at kmem_alloc_contig+0x6c/frame 0xffffffff826ffad0
contigmalloc() at contigmalloc+0x2e/frame 0xffffffff826ffb00
x86bios_modevent() at x86bios_modevent+0x225/frame 0xffffffff826ffb20
module_register_init() at module_register_init+0xc0/frame 0xffffffff826ffb50
mi_startup() at mi_startup+0x118/frame 0xffffffff826ffb70
start_kernel() at start_kernel+0x10
While there also make x86bios_unmap_mem idempotent.
Reviewed by: kib
Approved by: re (gjb)
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D17000
* Fix a bug where the SYN handling during established state was
applied to a front state.
* Move a check for retransmission after the timer handling.
This was suppressing timer based retransmissions.
* Fix an off-by one byte in the sequence number of retransmissions.
* Apply fixes corresponding to
https://svnweb.freebsd.org/changeset/base/336934
Reviewed by: rrs@
Approved by: re (kib@)
MFC after: 1 month
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D16912
relocation.
elf_relocaddr() has a hook to handle VIMAGE data addresses.
This fixes VIMAGE support for RISC-V when built as a module.
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
Similar to arm64, riscv compiler uses PC-relative loads/stores,
and with static data compiler does not emit relocations.
In result, kernel module linker has nothing to fix and data accessed
from the wrong location.
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
newvers.sh supports two modes for reproducible builds:
-r Reproducible build. Do not embed directory names, user
names, time stamps or other dynamic information into
the output file. This is intended to allow two builds
done at different times and even by different people on
different hosts to produce identical output.
-R Reproducible build if the tree represents an unmodified
checkout from a version control system. Metadata is
included if the tree is modified.
Switch to the second mode when reproducible builds are enabled.
The value of a reproducible build is much less when building from an
uncontrolled, modified src tree, and -R likely provides the best
compromise in allowing the REPRODUCIBLE_BUILD knob to be enabled by
default for the release.
Approved by: re (kib)
Sponsored by: The FreeBSD Foundation
From Piotr:
ix(4), ixv(4): Add VLAN tag strip check when receiving packets
ixv(4): Fix support for VLAN_HWTAGGING and VLAN_HWFILTER flags
This change will prevent driver from passing VLAN tags when
interface configuration is not expecting them. VF driver will
check for VLAN_HWTAGGING and VLAN_HWFILTER flags and act adequately.
This patch resolves problem occuring on EC2 platforms.
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reported by: cperciva@
Reviewed by: cperciva@, Intel Networking
Approved by: re
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D17061
of the fix/workaround for the "ctld hanging on reload" problem.
PR: 220175
Reported by: Eugene M. Zheganin <emz at norma.perm.ru>
Tested by: Eugene M. Zheganin <emz at norma.perm.ru>
Approved by: re (kib)
MFC after: 2 weeks
Sponsored by: playkey.net