Commit Graph

387 Commits

Author SHA1 Message Date
John Baldwin
651a887f4e xen: Use devclass_find to lookup devclasses in identify.
While here, use driver->name instead of hardcoding the xenpv and
xen_et strings both for devclass_find() and BUS_ADD_CHILD().

Reviewed by:	Elliott Mitchell <ehem+freebsd@m5p.com>, imp, royger
Differential Revision:	https://reviews.freebsd.org/D35001
2022-04-25 11:55:30 -07:00
Roger Pau Monné
b93f47eaee xen/acpi: upload Cx and Px data to Xen
When FreeBSD is running as dom0 (initial domain) on a Xen system it
has access to the native ACPI tables and is the OSPM. However the
hypervisor is the entity in charge of the CPU idle and frequency
states, and in order to perform this duty it requires information
found the ACPI dynamic tables that can only be parsed by the OSPM.

Introduce a new Xen specific ACPI driver to fetch the Processor
related information and upload it to Xen. Note that this driver needs
to take precedence over the generic ACPI CPU driver when running as
dom0, so downgrade the probe score of the native driver to
BUS_PROBE_DEFAULT in order for the Xen specific driver to use
BUS_PROBE_SPECIFIC.

Tested on an Intel NUC to successfully parse and upload both the Cx and
Px states to Xen.

Sponsored by: Citrix Systems R&D
Reviewed by: jhb kib
Differential revision: https://reviews.freebsd.org/D34841
2022-04-12 10:03:26 +02:00
John Baldwin
7650d4b67e xen netback: Remove write-only variables. 2022-04-06 16:45:27 -07:00
Eric van Gyzen
aca2a7faca stack_zero is not needed before stack_save
The man page was recently clarified to commit to this contract.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2022-03-25 20:10:38 -05:00
Mateusz Guzik
bb92cd7bcd vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd) 2022-03-24 10:20:51 +00:00
Elliott Mitchell
ad7dd51499 xen: switch to use headers in contrib
These headers originate with the Xen project and shouldn't be mixed with
the main portion of the FreeBSD kernel. Notably they shouldn't be the
target of clean-up commits.

Switch to use the headers in sys/contrib/xen.

Reviewed by: royger
2022-02-07 10:11:56 +01:00
Roger Pau Monné
759ae58c00 xen/grant-table: remove explicit linear mapping additions
There's no need to explicitly add linear mappings for the grant table
area, as the memory is allocated using xenmem_alloc and it should
already have a linear mapping that can be obtained using
rman_get_virtual.

While there also remove the return value of gnttab_map, since there's
no return value anymore.

Sponsored by: Citrix Systems R&D
Reviewed by: Elliott Mitchell <ehem+freebsd@m5p.com>
Differential revision: https://reviews.freebsd.org/D29602
2022-02-07 10:06:27 +01:00
Roger Pau Monné
ca46f3289d xen: use an hypercall for shutdown and reboot
When running as a Xen guest it's easier to use an hypercall in order
to do power management operations (power off, power cycle). Do this
for all supported guest types (HVM and PVH). Note that for HVM the
power operation could also be done using ACPI, but there's no reason
to differentiate between PVH and HVM.

While there fix the shutdown handler to properly differentiate between
power cycle and power off requests.

Reported by: Freddy DISSAUX
MFC: 1 week
Sponsored by: Citrix Systems R&D
2022-01-13 16:54:30 +01:00
Alexander Motin
54daceab55 xen/blkfront: Remove CTLFLAG_NEEDGIANT from sysctl.
It only converts bit field into string.  It does not need locking.

MFC after:	1 week
2021-12-25 21:24:24 -05:00
Mateusz Guzik
e7236a7ddf xen: plug some of set-but-not-used vars
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-15 13:46:17 +00:00
Warner Losh
c6df6f5322 Create wrapper for Giant taken for newbus
Create a wrapper for newbus to take giant and for busses to take it too.
bus_topo_lock() should be called before interacting with newbus routines
and unlocked with bus_topo_unlock(). If you need the topology lock for
some reason, bus_topo_mtx() will provide that.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D31831
2021-12-09 17:04:45 -07:00
Elliott Mitchell
d893d9e94d xen/dev: remove write-only variable
This was found while looking for driver_filter_t functions which got the
trap frame from the argument.  This particular instance it isn't even
used, so remove now lest someone else get to it first.

Reviewed by:	mhorne
2021-11-30 17:11:57 -04:00
Gordon Bergling
e3080a9cca xen(4): Fix two typos in source code comments
- s/segement/segment/

MFC after:	3 days
2021-11-30 10:39:42 +01:00
Mateusz Guzik
7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Roger Pau Monné
50d7d967bb xen/privcmd: fix MMAP_RESOURCE ioctl to copy out results
The current definition for the MMAP_RESOURCE ioctl was wrong as it
didn't copy back the result to the caller. Fix the definition and also
remove the bogus attempt to copy the result in the implementation.

Note such copy back is only needed when querying the size of a
resource.

Sponsored by: Citrix Systems R&D
2021-11-18 09:46:44 +01:00
Elliott Mitchell
5bb67f5f3f xen/devices: purge uses of intr_machdep.h
Devices in sys/dev should be architecture-independent and NOT #include
intr_machdep.h.

Reviewed by: mhorne royger
Differential Revision: https://reviews.freebsd.org/D29959
2021-10-21 09:39:16 +02:00
Roger Pau Monné
535badd1b8 xen/pcifront: purge from tree
Xen pcifront has been unhooked from the build for a long time, as it's
only used by PV mode which FreeBSD doesn't support. Remove it from the
tree.
2021-10-21 09:39:16 +02:00
Mark Johnston
a4667e09e6 Convert vm_page_alloc() callers to use vm_page_alloc_noobj().
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31986
2021-10-19 21:22:56 -04:00
Roger Pau Monné
f4c6843ec2 xen: use correct cache attributes for Xen specific memory regions
bus_activate_resource maps memory regions as uncacheable on x86, which
is more strict than required for regions allocated using xenmem_alloc,
so don't rely on bus_activate_resource and instead map the region
using pmap_mapdev_attr and VM_MEMATTR_XEN as the cache attribute.

Sponsored by: Citrix Systems R&D
2021-08-12 09:18:32 +02:00
Julien Grall
ac959cf544 xen: introduce xen_has_percpu_evtchn()
xen_vector_callback_enabled is x86 specific and availability of
per-cpu event channel delivery differs on other architectures.

Introduce a new helper to check if there's support for per-cpu event
channel injection.

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29402
2021-07-28 17:27:05 +02:00
Julien Grall
46c46edd18 xen/control: print warning on call of xctrl_suspend()
Presently suspend/resume and migration aren't supported on Xen/ARM.  As
such this shouldn't ever occur.

This likely applies to future Xen architectures (RISC-V) and
xctrl_suspend() needs dependency on intr_machdep.h fixed.

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29599
2021-07-28 17:27:05 +02:00
Elliott Mitchell
7de88bb4a2 xen/grant_table: cleanup max_nr_grant_frames()
This is no more or less than returning the smaller of two values.  Since
this is what min() does, use that to shrink max_nr_grant_frames() down
to the single line.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29840
2021-07-28 17:27:04 +02:00
Julien Grall
0b4f30c236 xen/control: introduce xen_pv_shutdown_handler()
While x86 only register PV shutdown handler for PV guests. ARM guests
are always using HVM and requires the PV shutdown handler.

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29406
2021-07-28 17:27:04 +02:00
Julien Grall
69c6eee756 xen: introduce xen_pv_disks_disabled()
ARM guest is considered as HVM in Freebsd but they only support PV disk
(no emulation available).

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29403
2021-07-28 17:27:04 +02:00
Julien Grall
5f70008327 xen/netfront: introduce xen_pv_nics_disabled()
ARM guest is considered as HVM but it only supports PV nics (no
emulation available).

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29405
2021-07-28 17:27:04 +02:00
Elliott Mitchell
e627e25d76 xen/xenpv: remove low memory limit for non-x86
For embedded devices reserved addresses will be known in advance.  More
recently added devices will also likely be correctly updated.  As a
result using any available address is reasonable on non-x86.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29304
2021-07-28 17:27:03 +02:00
Elliott Mitchell
d3705b5a7f xen/control: gate x86 specific code in the preprocessor
Commit 1522652230 was implemented strictly for x86.  Unfortunately
one of the pieces was mixed into a common area breaking other
architectures. For now disable these bits on !x86, this should be
cleaned up later.

Fixes: 1522652230 ('xen: fix dropping bitmap IPIs during resume')
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29306
2021-07-28 17:27:02 +02:00
Elliott Mitchell
b6ff9345a4 xen: create VM_MEMATTR_XEN for Xen memory mappings
The requirements for pages shared with Xen/other VMs may vary from
architecture to architecture.  As such create a macro which various
architectures can use.

Remove a use of PAT_WRITE_BACK in xenstore.c.  This is a x86-ism which
shouldn't have been present in a common area.

Original idea: Julien Grall <julien@xen.org>, 2014-01-14 06:44:08
Approach suggested by: royger
Reviewed by: royger, mhorne
Differential Revision: https://reviews.freebsd.org/D29351
2021-07-28 17:27:02 +02:00
Julien Grall
a48f7ba444 xen: move x86/xen/xenpv.c to dev/xen/bus/xenpv.c
Minor changes are necessary to make this processor-independent, but
moving the file out of x86 and into common is the first step (so
others don't add /more/ x86-isms).

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29042
2021-07-28 17:27:02 +02:00
Roger Pau Monné
ac3ede5371 x86/xen: remove PVHv1 code
PVHv1 was officially removed from Xen in 4.9, so just axe the related
code from FreeBSD.

Note FreeBSD supports PVHv2, which is the replacement for PVHv1.

Sponsored by: Citrix Systems R&D
Reviewed by: kib, Elliott Mitchell
Differential Revision: https://reviews.freebsd.org/D30228
2021-05-17 11:41:21 +02:00
Roger Pau Monné
4772e86beb xen/blkback: fix reconnection of backend
The hotplug script will be executed only once for each backend,
regardless of the frontend triggering reconnections. Fix blkback to
deal with the hotplug script being executed only once, so that
reconnections don't stall waiting for a hotplug script execution
that will never happen.

As a result of the fix move the initialization of dev_mode, dev_type
and dev_name to the watch callback, as they should be set only once
the first time the backend connects.

This fix is specially relevant for guests wanting to use UEFI OVMF
firmware, because OVMF will use Xen PV block devices and disconnect
afterwards, thus allowing them to be used by the guest OS. Without
this change the guest OS will stall waiting for the block backed to
attach.

Fixes: de0bad0001 ('blkback: add support for hotplug scripts')
MFC after: 1 week
Sponsored by: Citrix Systems R&D
2021-05-11 15:43:42 +02:00
Roger Pau Monné
4489124c04 xen/netback: do not attempt to connect in the Initialised state
Only attempt to fetch the configuration data and connect the shared
ring once the frontend has switched to the 'Connected' state. This
seems to be inline with what Linux netback does, and is required to
make newer versions of NetBSD netfront work, since NetBSD only
publishes the required configuration before switching to the Connected
state.

MFC after:	1 week
Sponsored by:	Citrix Systems R&D
2021-03-23 11:07:09 +01:00
John Baldwin
71ba16a0a0 xnb: Don't pass SIOC{ADD,DEL}MULTI to ifmedia_ioctl().
ifmedia_ioctl() doesn't handle these requests, and this matches what
xn does.

Reviewed by:	royger
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29296
2021-03-22 09:55:49 -07:00
Elliott Mitchell
a2c0e94ccf xen: remove x86-ism from Xen common code
PAT_WRITE_BACK is x86-only, whereas sys/dev/xen could be shared
between multiple architectures.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D28831
2021-03-01 13:33:01 +01:00
Roger Pau Monné
808d4aad10 xen-blkback: fix leak of grant maps on ring setup failure
Multi page rings are mapped using a single hypercall that gets passed
an array of grants to map. One of the grants in the array failing to
map would lead to the failure of the whole ring setup operation, but
there was no cleanup of the rest of the grant maps in the array that
could have likely been created as a result of the hypercall.

Add proper cleanup on the failure path during ring setup to unmap any
grants that could have been created.

This is part of XSA-361.

Sponsored by:	Citrix Systems R&D
2021-02-22 16:47:52 +01:00
Roger Pau Monné
952667da98 xen/efi: introduce a PV interface for EFI run time services for dom0
FreeBSD when running as a dom0 under Xen is not supposed to access the
run time services directly, and instead should proxy the calls through
Xen using an hypercall interface that exposes access to selected run
time services.

Implement the efirt interface on top of the Xen provided hypercalls.

Sponsored by:		Citrix Systems R&D
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D28621
2021-02-16 15:26:12 +01:00
Roger Pau Monne
a765078790 xen/privcmd: implement the restrict ioctl
Use an interface compatible with the Linux one so that the user-space
libraries already using the Linux interface can be used without much
modifications.

This allows an open privcmd instance to limit against which domains it
can act upon.

Sponsored by:	Citrix Systems R&D
2021-01-11 16:33:27 +01:00
Roger Pau Monne
ed78016d00 xen/privcmd: implement the dm op ioctl
Use an interface compatible with the Linux one so that the user-space
libraries already using the Linux interface can be used without much
modifications.

This allows user-space to make use of the dm_op family of hypercalls,
which are used by device models.

Sponsored by:	Citrix Systems R&D
2021-01-11 16:33:27 +01:00
Roger Pau Monne
658860e2d0 xen/privcmd: implement the map resource ioctl
The interface is mostly the same as the Linux ioctl, so that we don't
need to modify the user-space libraries that make use of it.

The ioctl is just a proxy for the XENMEM_acquire_resource hypercall.

Sponsored by:	Citrix Systems R&D
2021-01-11 16:15:00 +01:00
Roger Pau Monné
147e593921 xen/privcmd: split setup of virtual address range into helper
Preparatory change for further additions that will also make use of
the same code. No functional change.

Sponsored by:	Citrix Systems R&D
2021-01-11 16:14:59 +01:00
Roger Pau Monné
f713a5b37e xen/privcmd: make some integers unsigned
There's no reason for them to be signed. No functional change.

Sponsored by:	Citrix Systems R&D
2021-01-11 16:14:59 +01:00
Roger Pau Monné
4e4e43dc9e xen: allow limiting the amount of duplicated pending xenstore watches
Xenstore watches received are queued in a list and processed in a
deferred thread. Such queuing was done without any checking, so a
guest could potentially trigger a resource starvation against the
FreeBSD kernel if such kernel is watching any user-controlled xenstore
path.

Allowing limiting the amount of pending events a watch can accumulate
to prevent a remote guest from triggering this resource starvation
issue.

For the PV device backends and frontends this limitation is only
applied to the other end /state node, which is limited to 1 pending
event, the rest of the watched paths can still have unlimited pending
watches because they are either local or controlled by a privileged
domain.

The xenstore user-space device gets special treatment as it's not
possible for the kernel to know whether the paths being watched by
user-space processes are controlled by a guest domain. For this reason
watches set by the xenstore user-space device are limited to 1000
pending events. Note this can be modified using the
max_pending_watch_events sysctl of the device.

This is XSA-349.

Sponsored by:	Citrix Systems R&D
MFC after:	3 days
2020-12-30 11:18:26 +01:00
Konstantin Belousov
cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Konstantin Belousov
f10845877e Suspend all writeable local filesystems on power suspend.
This ensures that no writes are pending in memory, either metadata or
user data, but not including dirty pages not yet converted to fs writes.

Only filesystems declared local are suspended.

Note that this does not guarantee absence of the metadata errors or
leaks if resume is not done: for instance, on UFS unlinked but opened
inodes are leaked and require fsck to gc.

Reviewed by:	markj
Discussed with:	imp
Tested by:	imp (previous version), pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D27054
2020-11-05 20:52:49 +00:00
Konstantin Belousov
fbf2a77876 Convert allocations of the phys pager to vm_pager_allocate().
Future changes would require additional initialization of OBJT_PHYS
objects, and vm_object_allocate() is not suitable for it.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24652
2020-09-08 23:38:49 +00:00
Mateusz Guzik
6c7cae4a73 dev/xen: clean up empty lines in .c and .h files 2020-09-01 21:45:08 +00:00
Mateusz Guzik
7ad2a82da2 vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_error
Most consumers pass NULL.
2020-08-19 02:51:17 +00:00
Konstantin Belousov
4149c6a3ec Remove double-calls to tc_get_timecount() to warm timecounters.
It seems that second call does not add any useful state change for all
implemented timecounters.

Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2020-06-10 22:30:32 +00:00
Roger Pau Monné
06592d60f0 xen/control: short circuit xctrl_on_watch_event on spurious event
If there's no data to read from xenstore short-circuit
xctrl_on_watch_event to return early, there's no reason to continue
since the lack of data would prevent matching against any known event
type.

Sponsored by:	Citrix Systems R&D
MFC with:	r352925
MFC after:	1 week
2020-05-28 08:20:16 +00:00
Roger Pau Monné
3e7df58df2 xen/blkfront: use the correct type for disk sectors
The correct type to use to represent disk sectors is blkif_sector_t
(which is an uint64_t underneath). This avoid truncation of the disk
size calculation when resizing on i386, as otherwise the calculation
of d_mediasize in xbd_connect is truncated to the size of unsigned
long, which is 32bits on i386.

Note this issue didn't affect amd64, because the size of unsigned long
is 64bits there.

Sponsored by:	Citrix Systems R&D
MFC after:	1 week
2020-05-28 08:19:13 +00:00