verified under VMWare Fusion, comparing to what's reported under CentOS,
and by comparing numbers reported by linuxulator on T420 with a googled
up Linux cpuinfo (https://lkml.org/lkml/2011/11/29/116).
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20693
Rather than marking the read ahead/behind pages valid even though they were
not initialized, free them using the new function vm_page_free_invalid().
Reviewed by: markj, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D25430
that might be wired. If the page is wired then it cannot be freed now,
but the thread that eventually unwires it will free it at that point.
Reviewed by: markj, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D25430
ACL packet boundary flag should be 0 instead of 2 for LE PDU.
Some HCI will drop LE packet with PB flag is 2, and if sent,
some target may reject the packet.
PR: 248024
Reported by: Greg V
Reviewed by: Greg V, emax
Differential Revision: https://reviews.freebsd.org/D25704
This new function allows us to find the SMMU instance assigned
for a particular PCI RID.
Reviewed by: andrew
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D25687
The function is called from a KLD load handler, so it may sleep.
- Stop checking for errors from uma_zcreate(), they don't happen.
- Convert M_NOWAIT allocations to M_WAITOK.
- Remove error handling for existing M_WAITOK allocations.
- Fix style.
Reviewed by: cem, delphij, jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25696
When emulating a load pair or store pair in dtrace on arm64 we need to
copy the data between the stack and trap frame. When the registers are
either the link register or the zero register we will access memory
past the end of the trap frame as these are encoded as registers 30 and
31 respectively while the array they access only has 30 entries.
Fix this by creating 2 helper functions to perform the operation with
special cases for these registers.
Sponsored by: Innovate UK
Subsequent to r240317, kmem_free() was replaced with kva_free() (r254025).
kva_free() releases the KVA allocation for the mapped region, but no longer
clears the pmap (pagetable) entries.
An affected pmap_unmapdev operation would leave the still-pmap'd VA space
free for allocation by other KVA consumers. However, this bug easily
avoided notice for ~7 years because most devices (1) never call
pmap_unmapdev and (2) on amd64, mostly fit within the DMAP and do not need
KVA allocations. Other affected arch are less popular: i386, MIPS, and
PowerPC. Arm64, arm32, and riscv are not affected.
Reported by: Don Morris <dgmorris AT earthlink.net>
Submitted by: Don Morris (amd64 part)
Reviewed by: kib, markj, Don (!amd64 parts)
MFC after: I don't intend to, but you might want to
Sponsored by: Dell Isilon
Differential Revision: https://reviews.freebsd.org/D25689
These routines are similar to crypto_getreq() and crypto_freereq() but
operate on caller-supplied storage instead of allocating crypto
requests from a UMA zone.
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D25691
In xpt_release_device(), callout_stop() was being called without
holding the mutex (send_mtx) that is used to protect the callout.
So, move the mtx_unlock() call so that it is protected.
MFC after: 1 week
Sponsored by: Spectra Logic
In r361752 an error handling was introduced for using 0.0.0.0 or
255.255.255.255 as the address in connect() for TCP, since both
addresses can't be used. However, the stack maps 0.0.0.0 implicitly
to a local address and at least two regressions were reported.
Therefore, re-allow the usage of 0.0.0.0.
While there, change the error indicated when using 255.255.255.255
from EAFNOSUPPORT to EACCES as mentioned in the man-page of connect().
Reviewed by: rrs
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D25401
If the hypervisor advertises support for the DISCARD command then the
guest can perform TRIM commands, freeing space on the backing store.
If VIRTIO_BLK_F_DISCARD is enabled, advertise DISKFLAG_CANDELETE
Tested with FreeBSD guests on bhyve and KVM
Reviewed by: jhb
Tested by: freqlabs
MFC after: 1 month
Relnotes: yes
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D21708
This removes SCTP from in-tree kernel configuration files. Now, SCTP
can be enabled by simply loading the module, as discussed on
freebsd-net@.
Reviewed by: tuexen
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25611
The code would unconditionally lock the vnode to audit or call the
mac hoook, even if neither want to do anything. Pre-check the state
to avoid locking in the common case of nothing to do.
Note this code should not be normally executed anyway as vnodes are
always return ready. However, poll1/2 from will-it-scale use regular
files for benchmarking, presumably to focus on the interface itself
as the vnode handler is not supposed to do almost anything.
This in particular fixes poll2 which passes 128 fds.
$ ./poll2_processes -s 10
before: 134411
after: 271572
if_attach() -> if_attach_internal() will call if_attachdomain1(ifp) any time
an ethernet interface is setup *after*
SI_SUB_PROTO_IFATTACHDOMAIN/SI_ORDER_FIRST. This eventually leads to
nd6_ifattach() -> nd6_setmtu0() stashing off ifp->if_mtu in ndi->maxmtu
*before* ifp->if_mtu has been properly set in some scenarios, e.g., USB
ethernet adapter plugged in later on.
For interfaces that are created in early boot, we don't have this issue as
domains aren't constructed enough for them to attach and thus it gets
deferred to domainifattach at SI_SUB_PROTO_IFATTACHDOMAIN/SI_ORDER_SECOND
*after* the mtu has been set earlier in ether_ifattach().
PR: 248005
Submitted by: Mathew <mjanelle blackberry com>
MFC after: 1 week
This shortens fdalloc by over 60 bytes. Correctness verified by running both
variants at the same time and comparing the result of each call.
Note someone(tm) should make a pass at converting everything else feasible.
The AR9341 AHB runs at 225MHz, much faster than the 33MHz of the
AR71xx AHB. So not only is the math going to do weird things, it
will also wrap rather than being clamped.
So:
* clamp! don't wrap!
* tidy up some debugging
* add an option to throw an NMI rather than reset!
Tested:
* AR9341 SoC (TP-Link TL-WDR4300), patting/not patting the watchdog!
Use PVO_PADDR macro to get the physical address from a PVO, instead of
explicitly ANDing pvo_pte.pa with LPTE_RPGN where it is needed. Besides
improving readability, this is needed to support superpages (D25237), where
the steps to get the PA from a PVO are different.
Reviewed by: markj
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D25654
It keeps recalculated way more often than it is needed.
Provide a routine (fdlastfile) to get it if necessary.
Consumers may be better off with a bitmap iterator instead.
The code in nfscl_dofflayout() loops when a flexible file layout server
provides a small write data limit (no extant server is known to do this).
If/when it looped, it erroneously reused the "drpc" argument for the
mirror worker thread, corrupting it.
This patch fixes the problem by only using the calling thread after the
first loop iteration.
Found during testing by simulating a server with a small write size.
Since no extant pNFS server is known to provide a small write size,
this fix it not needed in practice at this time.
MFC after: 2 weeks
pmc_cpuid was uninitialized for most AMD processor families. We can still
populate this string for unimplemented families.
Also added a CPUID_TO_STEPPING macro and converted existing code to use it.
Reviewed by: mav
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25673
That follows Linux and fixes related drm-kmod-5.3 panic.
Reviewed by: imp, hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25657
Previously it would check 4, 3, 2, 1 lists. In practice by the time
it is getting called all lists have some elements and consequently
this does not result in new evictions.
Nonetheless, the code is clearer.
Tested by: pho
.. and stuff if into the unused target vnode field
This gets rid of concurrent nc_flag modifications racing with the
shrinker and consequently fixes a bug where such a change could have
been missed when cache_ncp_invalidate was being issued..
Reported by: zeising
Tested by: pho, zeising
Fixes: r362828 ("cache: lockless forward lookup with smr")
Stop using smp_ipi_mtx to protect global shootdown state, and
move/multiply the global state into pcpu. Now each CPU can initiate
shootdown IPI independently from other CPUs. Initiator enters
critical section, then fills its local PCPU shootdown info
(pc_smp_tlb_XXX), then clears scoreboard generation at location (cpu,
my_cpuid) for each target cpu. After that IPI is sent to all targets
which scan for zeroed scoreboard generation words. Upon finding such
word the shootdown data is read from corresponding cpu' pcpu, and
generation is set. Meantime initiator loops waiting for all zeroed
generations in scoreboard to update.
Initiator does not disable interrupts, which should allow
non-invalidation IPIs from deadlocking, it only needs to disable
preemption to pin itself to the instance of the pcpu smp_tlb data.
The generation is set before the actual invalidation is performed in
handler. It is safe because target CPU cannot return to userspace
before handler finishes. In principle only NMI can preempt the
handler, but NMI would see the kernel handler frame and not touch
not-invalidated user page table.
Handlers loop until they do not see zeroed scoreboard generations.
This, together with hardware keeping one pending IPI in LAPIC IRR
should prevent lost shootdowns.
Notes.
1. The code does protect writes to LAPIC ICR with exclusion. I believe
this is fine because we in fact do not send IPIs from interrupt
handlers. More for !x2APIC mode where ICR access for write requires
two registers write, we disable interrupts around it. If considered
incorrect, I can add per-cpu spinlock around ipi_send().
2. Scoreboard lines owned by given target CPU can be padded to the
cache line, to reduce ping-pong.
Reviewed by: markj (previous version)
Discussed with: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D25510
In case of errors, the cleanup was not consistent.
Thanks to Felix Weinrank for fuzzing the userland stack and making
me aware of the issue.
MFC after: 1 week