Commit Graph

1222 Commits

Author SHA1 Message Date
Corvin Köhne
16f02a4cb4 pci: add missing PCI id of Coffee Lake GPU
The PCI id of an UHD Graphics 630 for Coffee Lake GPUs is missing in
the PCI id list of all Intel GPUs.

You can take a look at
https://dgpu-docs.intel.com/devices/hardware-table.html to check that
this device id exists.  Or check the linux code:
d0e062ebb3

MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33460
2021-12-17 23:18:31 +02:00
Mateusz Guzik
e7236a7ddf xen: plug some of set-but-not-used vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-15 13:46:17 +00:00
Mateusz Guzik
ceed3949bc x86: plug a set-but-not-unused var in native_lapic_ipi_free
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-10 11:55:03 +00:00
Alexander Motin
8493918868 busdma: Remove outdated comments about Giant.
MFC after:	2 weeks
2021-12-09 22:18:53 -05:00
Alexander Motin
a69f810466 smist: Remove unneeded Giant from bus_dma_tag_create().
bus_dmamap_load() call uses BUS_DMA_NOWAIT.

MFC after:	2 weeks
2021-12-09 20:54:22 -05:00
John Baldwin
1a62e9bc00 Add <machine/tls.h> header to hold MD constants and helpers for TLS.
The header exports the following:

- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)

Note that TLS_TP_OFFSET does not account for if the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).

Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr.  libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).

A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.

Reviewed by:	kib (older version of x86/tls.h), jrtc27
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D33351
2021-12-09 13:17:13 -08:00
Bjoern A. Zeeb
df38ada293 modules: increase MAXMODNAME and provide backward compat
With various firmware files used by graphics and wireless drivers
we are exceeding the current 32 character module name (file path
in kldxref) length.
In order to overcome this issue bump it to the maximum path length
for the next version.
To be able to MFC provide backward compat support for another version
of the struct as the offsets for the second half change due to the
array size increase.

MAXMODNAME being defined to MAXPATHLEN needs param.h to be
included first.  With only 7 modules (or LinuxKPI module.h) not
doing that adjust them rather than including param.h in module.h [1].

Reported by:	Greg V (greg unrelenting.technology)
Sponsored by:	The FreeBSD Foundation
Suggested by:	imp [1]
MFC after:	10 days
Reviewed by:	imp (and others to different level)
Differential Revision:	https://reviews.freebsd.org/D32383
2021-12-09 18:09:53 +00:00
Alexander Motin
63346fef33 mca: Some error handling logic improvements.
- Enable local MCEs on capable Intel CPUs.  It delivers exceptions
only to the affected CPU instead of global broadcast, requiring a lot
of synchronization between CPUs.  AMD always deliver MCEs locally.
 - Make MCE handler process only uncorrected errors, while CMCI and
polling only corrected.  It reduces synchronization problems between
them and is explicitly recommended by the documentation.
 - Add minimal support for uncorrected software recoverable errors
on Intel CPUs.  It allows to avoid kernel panics in case uncorrected
errors do not affect current operation, like ones found during scrub
or write.  Such errors are only logged, postponing the panic until
the corrupted data will actually be needed (that may never happen).
 - Reduce polling period from 1 hour to 5 minutes.

MFC after:	2 weeks
2021-12-08 21:39:24 -05:00
Alexander Motin
9a128e1678 mca: Switch to using taskqueue_enqueue_timeout_sbt().
Previously it was not allowed on fast taskqueues.  It was fixed in
4730a8972b.  This should make no functional change, just a bit
cleaner and efficient code.

MFC after:	1 week
2021-12-08 12:29:15 -05:00
Alexander Motin
3bdba24c74 mca: Decode new Intel status bits.
MFC after:	1 week
2021-12-08 12:03:28 -05:00
Alexander Motin
935dc0de88 mca: Remove excessively verbose debug messages.
Expecially in case of AMD there was more than dozen lines per CPU.

MFC after:	1 week
2021-12-07 22:27:09 -05:00
Alexander Motin
c2003f2684 mca: Make some sysctls also a loader tunables.
MFC after:	1 week
2021-12-07 22:22:01 -05:00
Mark Johnston
553af8f1ec x86: Perform late TSC calibration before LAPIC timer calibration
This ensures that LAPIC calibration is done using the correct tsc_freq
value, i.e., the one associated with the TSC timecounter.  It does mean
though that TSC calibration cannot use sbinuptime() to read the
reference timecounter, as timehands are not yet set up.

Reviewed by:	kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33209
2021-12-06 10:42:19 -05:00
Mark Johnston
62d09b46ad x86: Defer LAPIC calibration until after timecounters are available
This ensures that we have a good reference timecounter for performing
calibration.

Change lapic_setup to avoid configuring the timer when booting, and move
calibration and initial configuration to a new lapic routine,
lapic_calibrate_timer.  This calibration will be initiated from
cpu_initclocks(), before an eventtimer is selected.

Reviewed by:	kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33206
2021-12-06 10:42:10 -05:00
Mark Johnston
f06f1d1fdb x86: Deduplicate clock.h
The headers were mostly identical on amd64 and i386.

No functional change intended.

Reviewed by:	cperciva, mav, imp, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33205
2021-12-06 10:39:08 -05:00
Scott Long
c0ea0b4989 Fix "set but not used" in the x86 pci driver.
Sponsored by: Rubicon Communications, LLC ("Netgate")
2021-12-05 15:10:16 -07:00
Mitchell Horne
03b3d7bbec x86: remove unused T_USER flag
It stopped being used in 3c256f5395, when trap() was reorganized to
have separate switch statements for user and kernel traps. Remove the
two leftover references and the flag itself.

Reviewed by:	kib
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D33253
2021-12-05 11:12:40 -04:00
Alexander Motin
a07d426509 x86: Make AMD elvtX dump more compact.
MFC after:	2 weeks
2021-12-04 21:47:19 -05:00
Scott Long
d85a58cb0c Fix "set but not used" in busdma_bounce.
Sponsored by: Rubicon Communications, LLC ("Netgate")
2021-12-03 15:20:42 -07:00
Mitchell Horne
1adebe3cd6 minidump: Parameterize minidumpsys()
The minidump code is written assuming that certain global state will not
change, and rightly so, since it executes from a kernel debugger
context. In order to support taking minidumps of a live system, we
should allow copies of relevant global state that is likely to change to
be passed as parameters to the minidumpsys() function.

This patch does the work of parameterizing this function, by adding a
struct minidumpstate argument. For now, this struct allows for copies of
the kernel message buffer, and the bitset that tracks which pages should
be dumped (vm_page_dump). Follow-up changes will actually make use of
these arguments.

Notably, dump_avail[] does not need a snapshot, since it is not expected
to change after system initialization.

The existing minidumpsys() definitions are renamed, and a thin MI
wrapper is added to kern_dump.c, which handles the construction of
the state struct. Thus, calling minidumpsys() remains as simple as
before.

Reviewed by:	kib, markj, jhb
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31989
2021-11-19 15:05:52 -04:00
Mark Johnston
22875f8879 x86: Implement deferred TSC calibration
There is no universal way to find the TSC frequency.  Newer Intel CPUs
may report it via CPUID leaves 0x15 and 0x16.  Sometimes it can be
obtained from the PLATFORM_INFO MSR as well, though we never use that.
On older platforms we derive the frequency using a DELAY(1000000) call,
which uses the 8254 PIT.  On some newer platforms the 8254 is apparently
non-functional, leading to bogus calibration results.  On such platforms
the TSC frequency must be available from CPUID.  It is also possible to
disable calibration with a tunable, in which case we try to parse the
brand string if the TSC freq is not available from CPUID.

CPUID 0x15 provides an authoritative TSC frequency value, but even that
is not always available on new Intel platforms.  CPUID 0x16 provides the
specified processor base frequency, which is not the same as the TSC
frequency.  Empirically, it is close enough for early boot, but too far
off for timekeeping: on a Comet Lake NUC, CPUID 0x16 yields 1600MHz but
the TSC frequency is rougly 1608MHz, leading to frequent clock stepping
when NTP is in use.

Thus we have a situation where we cannot calibrate using the PIT and
cannot obtain a precise frequency from CPUID (or MSRs).  This change
seeks to address that by using the CPUID 0x16 value during early boot
and refining the calibration later once ACPI-based timecounters are
available.  TSC frequency detection is thus split into two phases:

Early phase:
- On Intel platforms, query CPUID 0x15 and 0x16 and use that value
  initially if available.
- Otherwise, get an estimate using the PIT, reducing the delay loop to
  100ms from 1s.
- Continue to register the TSC as the CPU ticks provider early, even
  though the frequency may be off.  Otherwise any code executed during
  boot that uses cpu_ticks() (e.g., context switching) gets tripped up
  when the ticks provider changes.

Later phase:
- In SI_SUB_CLOCKS, once the timehands are initialized, load the current
  TSC and timecounter (sbinuptime()) values at the beginning and end of
  a 1s interval and use the timecounter frequency (typically from
  kvmclock, HPET or the ACPI PM timer) to estimate the TSC frequency.
- Update the TSC timecounter, global tsc_freq and CPU ticker with the
  new frequency and finally register the TSC as a timecounter.

Reviewed by:	kib, jhb (previous version)
Discussed with:	imp, cperciva
MFC after:	6 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32512
2021-11-15 16:13:24 -05:00
Mark Johnston
ab12e8db29 amd64: Reduce the amount of cpuset copying done for TLB shootdowns
We use pmap_invalidate_cpu_mask() to get the set of active CPUs.  This
(32-byte) set is copied by value through multiple frames until we get to
smp_targeted_tlb_shootdown(), where it is copied yet again.

Avoid this copying by having smp_targeted_tlb_shootdown() make a local
copy of the active CPUs for the pmap, and drop the cpuset parameter,
simplifying callers.  Also leverage the use of the non-destructive
CPU_FOREACH_ISSET to avoid unneeded copying within
smp_targeted_tlb_shootdown().

Reviewed by:	alc, kib
Tested by:	pho
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32792
2021-11-15 13:01:31 -05:00
Alexander Motin
6badb512a9 Prefer CPUID leaf 1Fh for Intel CPU topology detection.
Leaf 1Fh is a prefered extended version of 0Bh.  It is supported by
new Lader Lake CPUs, though does not report anything new so far.

MFC after:	2 weeks
2021-11-06 00:53:52 -04:00
Kyle Evans
6a8ea6d174 sched: split sched_ap_entry() out of sched_throw()
sched_throw() can no longer take a NULL thread, APs enter through
sched_ap_entry() instead.  This completely removes branching in the
common case and cleans up both paths.  No functional change intended.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D32829
2021-11-05 15:45:51 -05:00
Kyle Evans
589aed00e3 sched: separate out schedinit_ap()
schedinit_ap() sets up an AP for a later call to sched_throw(NULL).

Currently, ULE sets up some pcpu bits and fixes the idlethread lock with
a call to sched_throw(NULL); this results in a window where curthread is
setup in platforms' init_secondary(), but it has the wrong td_lock.
Typical platform AP startup procedure looks something like:

- Setup curthread
- ... other stuff, including cpu_initclocks_ap()
- Signal smp_started
- sched_throw(NULL) to enter the scheduler

cpu_initclocks_ap() may have callouts to process (e.g., nvme) and
attempt to sched_add() for this AP, but this attempt fails because
of the noted violated assumption leading to locking heartburn in
sched_setpreempt().

Interrupts are still disabled until cpu_throw() so we're not really at
risk of being preempted -- just let the scheduler in on it a little
earlier as part of setting up curthread.

Reviewed by:	alfredo, kib, markj
Triage help from:	andrew, markj
Smoke-tested by:	alfredo (ppc), kevans (arm64, x86), mhorne (arm)
Differential Revision:	https://reviews.freebsd.org/D32797
2021-11-03 15:54:59 -05:00
Kornel Duleba
06e6ca6dd3 dmar: Disable protected memory regions after initialization
Some BIOSes protect memory region they reside in by using DMAR to
prevent devices from doing any DMA transactions to that part of RAM.
AMI refers to this as "DMA Control Guarantee".
Disable the protection when address translation is enabled.
I stumbled upon this while investigation a failing coredump on a device
which has this feature enabled.

Sponsored by:		Stormshield
Obtained from:		Semihalf
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D32591
2021-10-29 10:08:25 +02:00
Kornel Duleba
3c02da8096 dmar: Don't try to reserve PCI regions for non-existing devices
In some cases we might have to create DMAR context before the
corresponding device has been enumerated by the PCI bus.
In that case we get called with NULL dev, because of that trying
to reserve PCI regions causes a NULL pointer dereference in
pci_find_pcie_root_port.

Sponsored by:		Stormshield
Obtained from:		Semihalf
MFC after:		2 weeks
Reviewed by:		kib, rlibby
Differential revision:	https://reviews.freebsd.org/D32589
2021-10-29 10:08:25 +02:00
Gleb Smirnoff
6aae3517ed Retire synchronous PPP kernel driver sppp(4).
The last two drivers that required sppp are cp(4) and ce(4).

These devices are still produced and can be purchased
at Cronyx <http://cronyx.ru/hardware/wan.html>.

Since Roman Kurakin <rik@FreeBSD.org> has quit them, they no
longer support FreeBSD officially.  Later they have dropped
support for Linux drivers to.  As of mid-2020 they don't even
have a developer to maintain their Windows driver.  However,
their support verbally told me that they could provide aid to
a FreeBSD developer with documentaion in case if there appears
a new customer for their devices.

These drivers have a feature to not use sppp(4) and create an
interface, but instead expose the device as netgraph(4) node.
Then, you can attach ng_ppp(4) with help of ports/net/mpd5 on
top of the node and get your synchronous PPP.  Alternatively
you can attach ng_frame_relay(4) or ng_cisco(4) for HDLC.
Actually, last time I used cp(4) back in 2004, using netgraph(4)
instead of sppp(4) was already the right way to do.

Thus, remove the sppp(4) related part of the drivers and enable
by default the negraph(4) part.  Further maintenance of these
drivers in the tree shouldn't be a big deal.

While doing that, remove some cruft and enable cp(4) compilation
on amd64.  The ce(4) for some unknown reason marks its internal
DDK functions with __attribute__ fastcall, which most likely is
safe to remove, but without hardware I'm not going to do that, so
ce(4) remains i386-only.

Reviewed by:		emaste, imp, donner
Differential Revision:	https://reviews.freebsd.org/D32590
See also:		https://reviews.freebsd.org/D23928
2021-10-22 11:41:36 -07:00
Konstantin Belousov
661bd70bd7 DMAR: clean up warnings about write-only variables
For some of them, used only when KTR or KMSAN are configured, apply
__unused attribute directly.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-10-21 21:40:46 +03:00
Mark Johnston
06ebadc5f5 x86: Remove some leftover APM support
This is obsolete since commit 8c576a279e ("Remove APM BIOS support").

Reviewed by:	imp, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32510
2021-10-18 09:56:59 -04:00
Mark Johnston
de8554295b cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it
This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well.  No functional change
intended.

This is a re-application of commit
9068f6ea69.

Reviewed by:	cem, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32029
2021-10-18 09:56:58 -04:00
Konstantin Belousov
f8d3368b43 apic: initialize lapic_paddr statically
The default value for LAPIC registers page physical address
is usually right. Having this value available early makes
pmap_force_invalidate_cache_range(), used on non-self-snoop machines,
avoid flushing LAPIC range for early calls.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32318
2021-10-06 05:52:56 +03:00
Mitchell Horne
ab4ed843a3 minidump: De-duplicate the progress bar
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31885
2021-09-29 16:42:21 -03:00
Konstantin Belousov
24a3897c2c x86 bounce_bus_dmamem_alloc(): use malloc_aligned() only when possible
malloc_domainset_aligned() requires that alignment is less than
page size. Fall back to other allocation methods, most likely
kmem_alloc_contig(), when malloc_aligned() cannot fullfill the driver
request.

Reported by:	Loic F <loic.f@hardenedbsd.org>
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32127
2021-09-25 15:58:12 +03:00
Alexander Motin
d3a8f98acb Make CPU children explicitly share parent unit numbers.
Before this device unit number match was coincidental and broke if I
disabled some CPU device(s).  Aside of cosmetics, for some drivers
(may be considered broken) it caused talking to wrong CPUs.
2021-09-24 23:31:51 -04:00
Alexander Motin
ef50d5fbc3 x86: Add NUMA nodes into CPU topology.
Depending on hardware, NUMA nodes may match last level caches, or
they may be above them (AMD Zen 2/3) or below (Intel Xeon w/ SNC).
This information is provided by ACPI instead of CPUID, and it is
provided for each CPU individually instead of mask widths, but
this code should be able to properly handle all the above cases.

This change should immediately allow idle stealing in sched_ule(4)
to prefer load from NUMA-local CPUs to remote ones when the node
does not match LLC.  Later we may think of how to better handle it
on sched_pickcpu() side.

MFC after:	1 month
2021-09-23 14:31:38 -04:00
Konstantin Belousov
e36d0e86e3 Revert "linux32: add a hack to avoid redefining the type of the savefpu tag"
This reverts commit 0f6829488e.
Also it changes the type of md_usr_fpu_save struct mdthread member
to void *, which is what uncovered this trouble.  Now the save area
is untyped, but since it is hidden behind accessors, it is not too
significant.  Since apparently there are consumers affected outside
the tree, this hack is better than one from the reverted revision.

PR:	258678
Reported by:	cy
Reviewed by:	cy, kevans, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32060
2021-09-22 23:17:47 +03:00
Mark Johnston
bcdc599dc2 Revert "cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it"
This reverts commit 9068f6ea69.

The underlying macro needs to be reworked to avoid problems with control
flow statements.

Reported by:	rlibby
2021-09-21 13:51:42 -04:00
Konstantin Belousov
0f6829488e linux32: add a hack to avoid redefining the type of the savefpu tag
when compiling in amd64 kernel environment with -m32.  This is a temporal
workaround for some future proper (but unclear) fix.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Mark Johnston
9068f6ea69 cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it
This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well.  No functional change
intended.

Reviewed by:	cem, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32029
2021-09-21 12:07:47 -04:00
Konstantin Belousov
2b6eec531a x86: duplicate acpi_wakeup.c per i386 and amd64
The file as is is the maze of #ifdef passages, all slightly different.
Divorcing i386 and amd64 version actually makes changing the code
easier, also no changes for i386 are planned.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31931
2021-09-14 00:23:14 +03:00
Konstantin Belousov
db2ba218d9 amd64 acpi_wakeup: map 1:1 whole low 4G for the trampoline page table
This is required since kernel text might be physically located
anywhere below 4G.

PR:	258432
Reported by:	Taku YAMAMOTO <taku@tackymt.homeip.net>
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31916
2021-09-13 19:52:13 +03:00
Konstantin Belousov
ceca8ac1ce x86 acpi_install_wakeup_handler(): style
Do not use tab between type and variable name in local declarations.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31916
2021-09-13 19:52:06 +03:00
Konstantin Belousov
e99255c8a6 amd64: do not touch low memory in acpi_wakeup_ap() if booted by UEFI
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31916
2021-09-13 19:51:52 +03:00
Colin Percival
cd165c8bf0 x86/tsc.c: Add TSLOG to test_tsc
On my benchmark system this takes ~ 14 ms; enough to be worth
recording in the boot time profile.
2021-09-09 17:02:15 -07:00
Andrew Turner
b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Dimitry Andric
8e3c56d6b6 xen: Fix warning by adding KERNBASE to modlist_paddr before casting
Clang 13 produces the following warning for hammer_time_xen():

sys/x86/xen/pv.c:183:19: error: the pointer incremented by -2147483648 refers past the last possible element for an array in 64-bit address space containing 256-bit (32-byte) elements (max possible 576460752303423488 elements) [-Werror,-Warray-bounds]
                    (vm_paddr_t)start_info->modlist_paddr + KERNBASE;
                                ^                               ~~~~~~~~
sys/xen/interface/arch-x86/hvm/start_info.h:131:5: note: array 'modlist_paddr' declared here
    uint64_t modlist_paddr;         /* Physical address of an array of           */
    ^

This is because the expression first casts start_info->modlist_paddr to
struct hvm_modlist_entry * (via vmpaddr_t), and *then* adds KERNBASE,
which is then interpreted as KERNBASE * sizeof(struct
hvm_modlist_entry).

Instead, parenthesize the addition to get the intended result, and cast
it to struct hvm_modlist_entry * afterwards. Also remove the cast to
vmpaddr_t since it is not necessary.

Reviewed by:	royger
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D31711
2021-08-29 19:43:00 +02:00
Adam Fenn
d4b2d3035a pvclock: Add vDSO support
Add vDSO support for timekeeping devices that support the KVM/XEN
paravirtual clock API.

Also, expose, in the userspace-accessible '<machine/pvclock.h>',
definitions that will be needed by 'libc' to support
'VDSO_TH_ALGO_X86_PVCLK'.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31418
2021-08-14 15:57:54 +03:00
Adam Fenn
6c69c6bb4c kvm_clock: KVM paravirtual clock support
Add support for the KVM paravirtual clock device.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29733
2021-08-14 15:57:54 +03:00
Adam Fenn
0b3382b863 pvclock: Add 'struct pvclock' API
Consolidate more hypervisor-agnostic functionality behind a new 'struct
pvclock' API.

This should also make it easier to subsequently add hypervisor-agnostic
vDSO timekeeping support.

Also, perform some clean-up:
    - Remove 'pvclock_get_last_cycles()'; do not allow external access
      to 'pvclock_last_systime' since this is not necessary.
    - Consolidate/simplify wall and system time reading codepaths.
    - Ensure correct ordering within wall and system time reading
      codepaths via 'atomic(9)' and 'rdtsc_ordered()' rather than via
      'rmb()'.
    - Remove some extra newlines.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31418
2021-08-14 15:57:54 +03:00