
[RFC,0/6] virtio: Solution to restrict memory access under Xen using xen-virtio DMA ops layer

Message ID 1649963973-22879-1-git-send-email-olekstysh@gmail.com (mailing list archive)

Message

Oleksandr Tyshchenko April 14, 2022, 7:19 p.m. UTC
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Hello all.

The purpose of this RFC patch series is to add support for restricting memory access under Xen using a dedicated
grant-table-based DMA ops layer. The series is based on Juergen Gross' initial work [1], which uses grant
references instead of raw guest physical addresses (GPAs) for virtio communication (acting as a kind of
software IOMMU).

The high-level idea is to create a new grant-table-based DMA ops layer for the Linux guest, whose main
purpose is to provide special 64-bit DMA addresses. Such an address is formed from the grant reference (for a page
to be shared with the backend) plus an offset, with the highest address bit set (so that the backend can
distinguish a grant-ref-based DMA address from a normal GPA). For this to work we need the ability
to allocate contiguous (consecutive) grant references for multi-page allocations. The backend then needs
to offer the VIRTIO_F_ACCESS_PLATFORM and VIRTIO_F_VERSION_1 feature bits (i.e. it must support the virtio-mmio
modern transport with 64-bit addresses in the virtqueue).

Xen's grant mapping mechanism is a secure and safe way to share pages between domains; it has proven to work
over many years (for example, in the context of traditional Xen PV drivers). So far, foreign mapping has been
used for the virtio backend to map and access guest memory. With foreign mapping, the backend is able to map
arbitrary pages of guest memory (or even Dom0 memory), and as a result a malicious backend running in a
non-trusted domain can take advantage of this. With grant mapping, by contrast, the backend is only allowed
to map pages that were explicitly granted by the guest beforehand, and nothing else.
According to discussions in various mainline threads, this solution would likely be welcome because it
fits perfectly into the security model Xen provides.

What is more, the grant-table-based solution requires zero changes to the Xen hypervisor itself, at least
with virtio-mmio and DT (in comparison, for example, with a "foreign mapping + virtio-iommu" solution, which would
require a whole new, complex emulator in the hypervisor, plus new functionality/hypercalls to pass the IOVA
from the virtio backend running elsewhere to the hypervisor and translate it to a GPA before mapping into
the P2M, or to deny the foreign mapping request if no corresponding IOVA-GPA mapping is present in the IOMMU
page table for that particular device). We only need to update the toolstack to insert a new "xen,dev-domid"
property into the virtio-mmio device node when creating the guest device tree. This property is both an indicator
for the guest to use grants and the ID of the Xen domain where the corresponding backend resides; the latter is
used as an argument to the grant mapping APIs. It is worth mentioning that the toolstack patch is based on the
not-yet-upstreamed "Virtio support for toolstack on Arm" series, which is under review now [2].
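As a rough illustration of what the toolstack would generate (the node address, region size, and interrupt
specifier below are made-up placeholder values; the exact schema is defined by the xen,dev-domid.yaml binding
added in this series), the guest device-tree node might look like:

```dts
/* Illustrative sketch only: reg and interrupts values are placeholders. */
virtio@2000000 {
	compatible = "virtio,mmio";
	reg = <0x0 0x02000000 0x0 0x200>;
	interrupts = <0x0 0x1 0xf04>;
	/* Indicator for the guest to use grants; the value is the
	 * domid of the Xen domain where the backend resides. */
	xen,dev-domid = <0x1>;
};
```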

Please note the following:
- The patch series only covers Arm and virtio-mmio (device tree) for now. To enable the restricted memory access
  feature on Arm, the following options should be set:
  CONFIG_XEN_VIRTIO = y
  CONFIG_XEN_HVM_VIRTIO_GRANT = y
- Some callbacks in the xen-virtio DMA ops layer (map_sg/unmap_sg, etc.) are not implemented yet, as they are not
  needed/used in the first prototype

The patch series is rebased on the Linux 5.18-rc2 tag and tested on a Renesas Salvator-X board with an H3 ES3.0
SoC (Arm64), with a standalone userspace (non-QEMU) virtio-mmio-based virtio-disk backend running in a driver
domain and a Linux guest running on the existing virtio-blk driver (frontend). No issues were observed. Guest
domain 'reboot/destroy' use cases work properly. I have also tested other use cases, such as assigning several
virtio block devices or a mix of virtio and Xen PV block devices to the guest.

1. Xen changes located at (last patch):
https://github.com/otyshchenko1/xen/commits/libxl_virtio_next
2. Linux changes located at:
https://github.com/otyshchenko1/linux/commits/virtio_grant5
3. virtio-disk changes located at:
https://github.com/otyshchenko1/virtio-disk/commits/virtio_grant

Any feedback/help would be highly appreciated.

[1] https://www.youtube.com/watch?v=IrlEdaIUDPk
[2] https://lore.kernel.org/xen-devel/1649442065-8332-1-git-send-email-olekstysh@gmail.com/

Juergen Gross (2):
  xen/grants: support allocating consecutive grants
  virtio: add option to restrict memory access under Xen

Oleksandr Tyshchenko (4):
  dt-bindings: xen: Add xen,dev-domid property description for
    xen-virtio layer
  virtio: Various updates to xen-virtio DMA ops layer
  arm/xen: Introduce xen_setup_dma_ops()
  arm/xen: Assign xen-virtio DMA ops for virtio devices in Xen guests

 .../devicetree/bindings/virtio/xen,dev-domid.yaml  |  39 +++
 arch/arm/include/asm/xen/xen-ops.h                 |   1 +
 arch/arm/mm/dma-mapping.c                          |   5 +-
 arch/arm/xen/enlighten.c                           |  11 +
 arch/arm64/include/asm/xen/xen-ops.h               |   1 +
 arch/arm64/mm/dma-mapping.c                        |   5 +-
 arch/x86/mm/init.c                                 |  15 +
 arch/x86/mm/mem_encrypt.c                          |   5 -
 arch/x86/xen/Kconfig                               |   9 +
 drivers/xen/Kconfig                                |  20 ++
 drivers/xen/Makefile                               |   1 +
 drivers/xen/grant-table.c                          | 238 +++++++++++++--
 drivers/xen/xen-virtio.c                           | 335 +++++++++++++++++++++
 include/xen/arm/xen-ops.h                          |  20 ++
 include/xen/grant_table.h                          |   4 +
 include/xen/xen-ops.h                              |  13 +
 16 files changed, 679 insertions(+), 43 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/virtio/xen,dev-domid.yaml
 create mode 100644 arch/arm/include/asm/xen/xen-ops.h
 create mode 100644 arch/arm64/include/asm/xen/xen-ops.h
 create mode 100644 drivers/xen/xen-virtio.c
 create mode 100644 include/xen/arm/xen-ops.h

Comments

Christoph Hellwig April 15, 2022, 7:41 a.m. UTC | #1
I can only see three out of 6 patches on the linux-arm-kernel list,
which makes reviewing this impossible.  Also please Cc me directly
on any series doing crazy things with dma ops.  Thanks!
Michael S. Tsirkin April 15, 2022, 8:44 a.m. UTC | #2
On Thu, Apr 14, 2022 at 10:19:27PM +0300, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> Hello all.
> 
> The purpose of this RFC patch series is to add support for restricting memory access under Xen using specific
> grant table based DMA ops layer. Patch series is based on Juergen Gross’ initial work [1] which implies using
> grant references instead of raw guest physical addresses (GPA) for the virtio communications (some kind of
> the software IOMMU).
> 
> The high level idea is to create new Xen’s grant table based DMA ops layer for the guest Linux whose main
> purpose is to provide a special 64-bit DMA address which is formed by using the grant reference (for a page
> to be shared with the backend) with offset and setting the highest address bit (this is for the backend to
> be able to distinguish grant ref based DMA address from normal GPA). For this to work we need the ability
> to allocate contiguous (consecutive) grant references for multi-page allocations. And the backend then needs
> to offer VIRTIO_F_ACCESS_PLATFORM and VIRTIO_F_VERSION_1 feature bits (it must support virtio-mmio modern
> transport for 64-bit addresses in the virtqueue).

I'm not enough of a xen expert to review this, and I didn't get
all patches, but I'm very happy to see that approach being
taken. VIRTIO_F_ACCESS_PLATFORM and VIRTIO_F_VERSION_1 are
exactly the way to declare not all of memory is accessible.
Thanks!

> Xen's grant mapping mechanism is the secure and safe solution to share pages between domains which proven
> to work and works for years (in the context of traditional Xen PV drivers for example). So far, the foreign
> mapping is used for the virtio backend to map and access guest memory. With the foreign mapping, the backend
> is able to map arbitrary pages from the guest memory (or even from Dom0 memory). And as the result, the malicious
> backend which runs in a non-trusted domain can take advantage of this. Instead, with the grant mapping
> the backend is only allowed to map pages which were explicitly granted by the guest before and nothing else. 
> According to the discussions in various mainline threads this solution would likely be welcome because it
> perfectly fits in the security model Xen provides. 
> 
> What is more, the grant table based solution requires zero changes to the Xen hypervisor itself at least
> with virtio-mmio and DT (in comparison, for example, with "foreign mapping + virtio-iommu" solution which would
> require the whole new complex emulator in hypervisor in addition to new functionality/hypercall to pass IOVA
> from the virtio backend running elsewhere to the hypervisor and translate it to the GPA before mapping into
> P2M or denying the foreign mapping request if no corresponding IOVA-GPA mapping present in the IOMMU page table
> for that particular device). We only need to update toolstack to insert a new "xen,dev-domid" property to
> the virtio-mmio device node when creating a guest device-tree (this is an indicator for the guest to use grants
> and the ID of Xen domain where the corresponding backend resides, it is used as an argument to the grant mapping
> APIs). It worth mentioning that toolstack patch is based on non  upstreamed yet “Virtio support for toolstack
> on Arm” series which is on review now [2].
> 
> Please note the following:
> - Patch series only covers Arm and virtio-mmio (device-tree) for now. To enable the restricted memory access
>   feature on Arm the following options should be set:
>   CONFIG_XEN_VIRTIO = y
>   CONFIG_XEN_HVM_VIRTIO_GRANT = y
> - Some callbacks in xen-virtio DMA ops layer (map_sg/unmap_sg, etc) are not implemented yet as they are not
>   needed/used in the first prototype
> 
> Patch series is rebased on Linux 5.18-rc2 tag and tested on Renesas Salvator-X board + H3 ES3.0 SoC (Arm64)
> with standalone userspace (non-Qemu) virtio-mmio based virtio-disk backend running in Driver domain and Linux
> guest running on existing virtio-blk driver (frontend). No issues were observed. Guest domain 'reboot/destroy'
> use-cases work properly. I have also tested other use-cases such as assigning several virtio block devices
> or a mix of virtio and Xen PV block devices to the guest. 
> 
> 1. Xen changes located at (last patch):
> https://github.com/otyshchenko1/xen/commits/libxl_virtio_next
> 2. Linux changes located at:
> https://github.com/otyshchenko1/linux/commits/virtio_grant5
> 3. virtio-disk changes located at:
> https://github.com/otyshchenko1/virtio-disk/commits/virtio_grant
> 
> Any feedback/help would be highly appreciated.
> 
> [1] https://www.youtube.com/watch?v=IrlEdaIUDPk
> [2] https://lore.kernel.org/xen-devel/1649442065-8332-1-git-send-email-olekstysh@gmail.com/
> 
> Juergen Gross (2):
>   xen/grants: support allocating consecutive grants
>   virtio: add option to restrict memory access under Xen
> 
> Oleksandr Tyshchenko (4):
>   dt-bindings: xen: Add xen,dev-domid property description for
>     xen-virtio layer
>   virtio: Various updates to xen-virtio DMA ops layer
>   arm/xen: Introduce xen_setup_dma_ops()
>   arm/xen: Assign xen-virtio DMA ops for virtio devices in Xen guests
> 
>  .../devicetree/bindings/virtio/xen,dev-domid.yaml  |  39 +++
>  arch/arm/include/asm/xen/xen-ops.h                 |   1 +
>  arch/arm/mm/dma-mapping.c                          |   5 +-
>  arch/arm/xen/enlighten.c                           |  11 +
>  arch/arm64/include/asm/xen/xen-ops.h               |   1 +
>  arch/arm64/mm/dma-mapping.c                        |   5 +-
>  arch/x86/mm/init.c                                 |  15 +
>  arch/x86/mm/mem_encrypt.c                          |   5 -
>  arch/x86/xen/Kconfig                               |   9 +
>  drivers/xen/Kconfig                                |  20 ++
>  drivers/xen/Makefile                               |   1 +
>  drivers/xen/grant-table.c                          | 238 +++++++++++++--
>  drivers/xen/xen-virtio.c                           | 335 +++++++++++++++++++++
>  include/xen/arm/xen-ops.h                          |  20 ++
>  include/xen/grant_table.h                          |   4 +
>  include/xen/xen-ops.h                              |  13 +
>  16 files changed, 679 insertions(+), 43 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/virtio/xen,dev-domid.yaml
>  create mode 100644 arch/arm/include/asm/xen/xen-ops.h
>  create mode 100644 arch/arm64/include/asm/xen/xen-ops.h
>  create mode 100644 drivers/xen/xen-virtio.c
>  create mode 100644 include/xen/arm/xen-ops.h
> 
> -- 
> 2.7.4
Oleksandr Tyshchenko April 15, 2022, 10:04 a.m. UTC | #3
On 15.04.22 10:41, Christoph Hellwig wrote:

Hello Christoph

> I can only see three out of 6 patches on the linux-arm-kernel list,
> which makes reviewing this impossible.


Oops, I will add linux-arm-kernel. I blindly followed what
get_maintainer.pl suggested for each patch and manually added some Xen
folks; but, indeed, the first three patches add the base of this enabling work.


> Also please Cc me directly
> on any series doing crazy things with dma ops.  Thanks!

yes, sure.
Oleksandr Tyshchenko April 15, 2022, 3:20 p.m. UTC | #4
On 14.04.22 22:43, H. Peter Anvin wrote:

Hello Peter


> On April 14, 2022 12:19:29 PM PDT, Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
>> From: Juergen Gross <jgross@suse.com>
>>
>> In order to support virtio in Xen guests add a config option enabling
>> the user to specify whether in all Xen guests virtio should be able to
>> access memory via Xen grant mappings only on the host side.
>>
>> This applies to fully virtualized guests only, as for paravirtualized
>> guests this is mandatory.
>>
>> This requires to switch arch_has_restricted_virtio_memory_access()
>> from a pure stub to a real function on x86 systems (Arm systems are
>> not covered by now).
>>
>> Add the needed functionality by providing a special set of DMA ops
>> handling the needed grant operations for the I/O pages.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> arch/x86/mm/init.c        |  15 ++++
>> arch/x86/mm/mem_encrypt.c |   5 --
>> arch/x86/xen/Kconfig      |   9 +++
>> drivers/xen/Kconfig       |  20 ++++++
>> drivers/xen/Makefile      |   1 +
>> drivers/xen/xen-virtio.c  | 177 ++++++++++++++++++++++++++++++++++++++++++++++
>> include/xen/xen-ops.h     |   8 +++
>> 7 files changed, 230 insertions(+), 5 deletions(-)
>> create mode 100644 drivers/xen/xen-virtio.c
>>
>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>> index d8cfce2..526a3b2 100644
>> --- a/arch/x86/mm/init.c
>> +++ b/arch/x86/mm/init.c
>> @@ -8,6 +8,8 @@
>> #include <linux/kmemleak.h>
>> #include <linux/sched/task.h>
>>
>> +#include <xen/xen.h>
>> +
>> #include <asm/set_memory.h>
>> #include <asm/e820/api.h>
>> #include <asm/init.h>
>> @@ -1065,3 +1067,16 @@ unsigned long max_swapfile_size(void)
>> 	return pages;
>> }
>> #endif
>> +
>> +#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>> +int arch_has_restricted_virtio_memory_access(void)
>> +{
>> +	if (IS_ENABLED(CONFIG_XEN_PV_VIRTIO) && xen_pv_domain())
>> +		return 1;
>> +	if (IS_ENABLED(CONFIG_XEN_HVM_VIRTIO_GRANT) && xen_hvm_domain())
>> +		return 1;
>> +
>> +	return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
>> +}
>> +EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
>> +#endif
>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>> index 50d2099..dda020f 100644
>> --- a/arch/x86/mm/mem_encrypt.c
>> +++ b/arch/x86/mm/mem_encrypt.c
>> @@ -77,8 +77,3 @@ void __init mem_encrypt_init(void)
>> 	print_mem_encrypt_feature_info();
>> }
>>
>> -int arch_has_restricted_virtio_memory_access(void)
>> -{
>> -	return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
>> -}
>> -EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
>> diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
>> index 85246dd..dffdffd 100644
>> --- a/arch/x86/xen/Kconfig
>> +++ b/arch/x86/xen/Kconfig
>> @@ -92,3 +92,12 @@ config XEN_DOM0
>> 	select X86_X2APIC if XEN_PVH && X86_64
>> 	help
>> 	  Support running as a Xen Dom0 guest.
>> +
>> +config XEN_PV_VIRTIO
>> +	bool "Xen virtio support for PV guests"
>> +	depends on XEN_VIRTIO && XEN_PV
>> +	default y
>> +	help
>> +	  Support virtio for running as a paravirtualized guest. This will
>> +	  need support on the backend side (qemu or kernel, depending on the
>> +	  virtio device types used).
>> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
>> index 120d32f..fc61f7a 100644
>> --- a/drivers/xen/Kconfig
>> +++ b/drivers/xen/Kconfig
>> @@ -335,4 +335,24 @@ config XEN_UNPOPULATED_ALLOC
>> 	  having to balloon out RAM regions in order to obtain physical memory
>> 	  space to create such mappings.
>>
>> +config XEN_VIRTIO
>> +	bool "Xen virtio support"
>> +	default n
>> +	depends on VIRTIO && DMA_OPS
>> +	select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>> +	help
>> +	  Enable virtio support for running as Xen guest. Depending on the
>> +	  guest type this will require special support on the backend side
>> +	  (qemu or kernel, depending on the virtio device types used).
>> +
>> +config XEN_HVM_VIRTIO_GRANT
>> +	bool "Require virtio for fully virtualized guests to use grant mappings"
>> +	depends on XEN_VIRTIO && X86_64
>> +	default y
>> +	help
>> +	  Require virtio for fully virtualized guests to use grant mappings.
>> +	  This will avoid the need to give the backend the right to map all
>> +	  of the guest memory. This will need support on the backend side
>> +	  (qemu or kernel, depending on the virtio device types used).
>> +
>> endmenu
>> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
>> index 5aae66e..767009c 100644
>> --- a/drivers/xen/Makefile
>> +++ b/drivers/xen/Makefile
>> @@ -39,3 +39,4 @@ xen-gntalloc-y				:= gntalloc.o
>> xen-privcmd-y				:= privcmd.o privcmd-buf.o
>> obj-$(CONFIG_XEN_FRONT_PGDIR_SHBUF)	+= xen-front-pgdir-shbuf.o
>> obj-$(CONFIG_XEN_UNPOPULATED_ALLOC)	+= unpopulated-alloc.o
>> +obj-$(CONFIG_XEN_VIRTIO)		+= xen-virtio.o
>> diff --git a/drivers/xen/xen-virtio.c b/drivers/xen/xen-virtio.c
>> new file mode 100644
>> index 00000000..cfd5eda
>> --- /dev/null
>> +++ b/drivers/xen/xen-virtio.c
>> @@ -0,0 +1,177 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/******************************************************************************
>> + * Xen virtio driver - enables using virtio devices in Xen guests.
>> + *
>> + * Copyright (c) 2021, Juergen Gross <jgross@suse.com>
>> + */
>> +
>> +#include <linux/module.h>
>> +#include <linux/dma-map-ops.h>
>> +#include <linux/pci.h>
>> +#include <linux/pfn.h>
>> +#include <linux/virtio_config.h>
>> +#include <xen/xen.h>
>> +#include <xen/grant_table.h>
>> +
>> +#define XEN_GRANT_ADDR_OFF	0x8000000000000000ULL
>> +
>> +static inline dma_addr_t grant_to_dma(grant_ref_t grant)
>> +{
>> +	return XEN_GRANT_ADDR_OFF | ((dma_addr_t)grant << PAGE_SHIFT);
>> +}
>> +
>> +static inline grant_ref_t dma_to_grant(dma_addr_t dma)
>> +{
>> +	return (grant_ref_t)((dma & ~XEN_GRANT_ADDR_OFF) >> PAGE_SHIFT);
>> +}
>> +
>> +/*
>> + * DMA ops for Xen virtio frontends.
>> + *
>> + * Used to act as a kind of software IOMMU for Xen guests by using grants as
>> + * DMA addresses.
>> + * Such a DMA address is formed by using the grant reference as a frame
>> + * number and setting the highest address bit (this bit is for the backend
>> + * to be able to distinguish it from e.g. a mmio address).
>> + *
>> + * Note that for now we hard wire dom0 to be the backend domain. In order to
>> + * support any domain as backend we'd need to add a way to communicate the
>> + * domid of this backend, e.g. via Xenstore or via the PCI-device's config
>> + * space.
>> + */
>> +static void *xen_virtio_dma_alloc(struct device *dev, size_t size,
>> +				  dma_addr_t *dma_handle, gfp_t gfp,
>> +				  unsigned long attrs)
>> +{
>> +	unsigned int n_pages = PFN_UP(size);
>> +	unsigned int i;
>> +	unsigned long pfn;
>> +	grant_ref_t grant;
>> +	void *ret;
>> +
>> +	ret = (void *)__get_free_pages(gfp, get_order(size));
>> +	if (!ret)
>> +		return NULL;
>> +
>> +	pfn = virt_to_pfn(ret);
>> +
>> +	if (gnttab_alloc_grant_reference_seq(n_pages, &grant)) {
>> +		free_pages((unsigned long)ret, get_order(size));
>> +		return NULL;
>> +	}
>> +
>> +	for (i = 0; i < n_pages; i++) {
>> +		gnttab_grant_foreign_access_ref(grant + i, 0,
>> +						pfn_to_gfn(pfn + i), 0);
>> +	}
>> +
>> +	*dma_handle = grant_to_dma(grant);
>> +
>> +	return ret;
>> +}
>> +
>> +static void xen_virtio_dma_free(struct device *dev, size_t size, void *vaddr,
>> +				dma_addr_t dma_handle, unsigned long attrs)
>> +{
>> +	unsigned int n_pages = PFN_UP(size);
>> +	unsigned int i;
>> +	grant_ref_t grant;
>> +
>> +	grant = dma_to_grant(dma_handle);
>> +
>> +	for (i = 0; i < n_pages; i++)
>> +		gnttab_end_foreign_access_ref(grant + i);
>> +
>> +	gnttab_free_grant_reference_seq(grant, n_pages);
>> +
>> +	free_pages((unsigned long)vaddr, get_order(size));
>> +}
>> +
>> +static struct page *xen_virtio_dma_alloc_pages(struct device *dev, size_t size,
>> +					       dma_addr_t *dma_handle,
>> +					       enum dma_data_direction dir,
>> +					       gfp_t gfp)
>> +{
>> +	WARN_ONCE(1, "xen_virtio_dma_alloc_pages size %ld\n", size);
>> +	return NULL;
>> +}
>> +
>> +static void xen_virtio_dma_free_pages(struct device *dev, size_t size,
>> +				      struct page *vaddr, dma_addr_t dma_handle,
>> +				      enum dma_data_direction dir)
>> +{
>> +	WARN_ONCE(1, "xen_virtio_dma_free_pages size %ld\n", size);
>> +}
>> +
>> +static dma_addr_t xen_virtio_dma_map_page(struct device *dev, struct page *page,
>> +					  unsigned long offset, size_t size,
>> +					  enum dma_data_direction dir,
>> +					  unsigned long attrs)
>> +{
>> +	grant_ref_t grant;
>> +
>> +	if (gnttab_alloc_grant_references(1, &grant))
>> +		return 0;
>> +
>> +	gnttab_grant_foreign_access_ref(grant, 0, xen_page_to_gfn(page),
>> +					dir == DMA_TO_DEVICE);
>> +
>> +	return grant_to_dma(grant) + offset;
>> +}
>> +
>> +static void xen_virtio_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
>> +				      size_t size, enum dma_data_direction dir,
>> +				      unsigned long attrs)
>> +{
>> +	grant_ref_t grant;
>> +
>> +	grant = dma_to_grant(dma_handle);
>> +
>> +	gnttab_end_foreign_access_ref(grant);
>> +
>> +	gnttab_free_grant_reference(grant);
>> +}
>> +
>> +static int xen_virtio_dma_map_sg(struct device *dev, struct scatterlist *sg,
>> +				 int nents, enum dma_data_direction dir,
>> +				 unsigned long attrs)
>> +{
>> +	WARN_ONCE(1, "xen_virtio_dma_map_sg nents %d\n", nents);
>> +	return -EINVAL;
>> +}
>> +
>> +static void xen_virtio_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
>> +				    int nents, enum dma_data_direction dir,
>> +				    unsigned long attrs)
>> +{
>> +	WARN_ONCE(1, "xen_virtio_dma_unmap_sg nents %d\n", nents);
>> +}
>> +
>> +static int xen_virtio_dma_dma_supported(struct device *dev, u64 mask)
>> +{
>> +	return 1;
>> +}
>> +
>> +static const struct dma_map_ops xen_virtio_dma_ops = {
>> +	.alloc = xen_virtio_dma_alloc,
>> +	.free = xen_virtio_dma_free,
>> +	.alloc_pages = xen_virtio_dma_alloc_pages,
>> +	.free_pages = xen_virtio_dma_free_pages,
>> +	.mmap = dma_common_mmap,
>> +	.get_sgtable = dma_common_get_sgtable,
>> +	.map_page = xen_virtio_dma_map_page,
>> +	.unmap_page = xen_virtio_dma_unmap_page,
>> +	.map_sg = xen_virtio_dma_map_sg,
>> +	.unmap_sg = xen_virtio_dma_unmap_sg,
>> +	.dma_supported = xen_virtio_dma_dma_supported,
>> +};
>> +
>> +void xen_virtio_setup_dma_ops(struct device *dev)
>> +{
>> +	dev->dma_ops = &xen_virtio_dma_ops;
>> +}
>> +EXPORT_SYMBOL_GPL(xen_virtio_setup_dma_ops);
>> +
>> +MODULE_DESCRIPTION("Xen virtio support driver");
>> +MODULE_AUTHOR("Juergen Gross <jgross@suse.com>");
>> +MODULE_LICENSE("GPL");
>> diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
>> index a3584a3..ae3c1bc 100644
>> --- a/include/xen/xen-ops.h
>> +++ b/include/xen/xen-ops.h
>> @@ -221,4 +221,12 @@ static inline void xen_preemptible_hcall_end(void) { }
>>
>> #endif /* CONFIG_XEN_PV && !CONFIG_PREEMPTION */
>>
>> +#ifdef CONFIG_XEN_VIRTIO
>> +void xen_virtio_setup_dma_ops(struct device *dev);
>> +#else
>> +static inline void xen_virtio_setup_dma_ops(struct device *dev)
>> +{
>> +}
>> +#endif /* CONFIG_XEN_VIRTIO */
>> +
>> #endif /* INCLUDE_XEN_OPS_H */
> Can you please encapsulate the Xen part of the test in some Xen-specific file?

I assume your question is about changes to common arch/x86/mm/init.c?


#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
int arch_has_restricted_virtio_memory_access(void)
{
     if (IS_ENABLED(CONFIG_XEN_PV_VIRTIO) && xen_pv_domain())
         return 1;
     if (IS_ENABLED(CONFIG_XEN_HVM_VIRTIO_GRANT) && xen_hvm_domain())
         return 1;

     return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
}
EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);


If we are speaking about the whole function, then I am not sure,
unfortunately. I think if this were a purely Xen-specific function
(used by Xen guests only), I would move it somewhere to
arch/x86/xen/... (probably to arch/x86/xen/enlighten.c).

As I understand it (please bear in mind I am not too familiar with the
x86 code), so far this function was only used by SEV guests, but with
the current changes it is going to be used by both Xen and SEV guests, so it
should be available even if Xen support is compiled out. Could you
please suggest a better place to keep this stuff?


If we are speaking only about the Xen-specific bits in that function, then
definitely yes; for example, in this way:

1. arch/x86/mm/init.c or other common (non-Xen specific) location:

#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
int arch_has_restricted_virtio_memory_access(void)
{
     return (xen_has_restricted_virtio_memory_access() ||
             cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT));
}
EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
#endif

2.  include/xen/xen.h or other Xen specific location:

static inline int xen_has_restricted_virtio_memory_access(void)
{
     if (IS_ENABLED(CONFIG_XEN_PV_VIRTIO) && xen_pv_domain())
         return 1;
     if (IS_ENABLED(CONFIG_XEN_HVM_VIRTIO_GRANT) && xen_hvm_domain())
         return 1;

     return 0;
}

Or I misunderstood your question?
Oleksandr Tyshchenko April 15, 2022, 3:29 p.m. UTC | #5
On 15.04.22 11:44, Michael S. Tsirkin wrote:


Hello Michael



> On Thu, Apr 14, 2022 at 10:19:27PM +0300, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> Hello all.
>>
>> The purpose of this RFC patch series is to add support for restricting memory access under Xen using specific
>> grant table based DMA ops layer. Patch series is based on Juergen Gross’ initial work [1] which implies using
>> grant references instead of raw guest physical addresses (GPA) for the virtio communications (some kind of
>> the software IOMMU).
>>
>> The high level idea is to create new Xen’s grant table based DMA ops layer for the guest Linux whose main
>> purpose is to provide a special 64-bit DMA address which is formed by using the grant reference (for a page
>> to be shared with the backend) with offset and setting the highest address bit (this is for the backend to
>> be able to distinguish grant ref based DMA address from normal GPA). For this to work we need the ability
>> to allocate contiguous (consecutive) grant references for multi-page allocations. And the backend then needs
>> to offer VIRTIO_F_ACCESS_PLATFORM and VIRTIO_F_VERSION_1 feature bits (it must support virtio-mmio modern
>> transport for 64-bit addresses in the virtqueue).
> I'm not enough of a xen expert to review this, and I didn't get
> all patches, but I'm very happy to see that approach being
> taken. VIRTIO_F_ACCESS_PLATFORM and VIRTIO_F_VERSION_1 are
> exactly the way to declare not all of memory is accessible.
> Thanks!

I am happy to hear that! Thank you.


Regarding "all patches": I have already resent the missing ones, and I hope
you and Christoph will get them.

Sorry for the inconvenience.


>
>> Xen's grant mapping mechanism is the secure and safe solution to share pages between domains which proven
>> to work and works for years (in the context of traditional Xen PV drivers for example). So far, the foreign
>> mapping is used for the virtio backend to map and access guest memory. With the foreign mapping, the backend
>> is able to map arbitrary pages from the guest memory (or even from Dom0 memory). And as the result, the malicious
>> backend which runs in a non-trusted domain can take advantage of this. Instead, with the grant mapping
>> the backend is only allowed to map pages which were explicitly granted by the guest before and nothing else.
>> According to the discussions in various mainline threads this solution would likely be welcome because it
>> perfectly fits in the security model Xen provides.
>>
>> What is more, the grant table based solution requires zero changes to the Xen hypervisor itself at least
>> with virtio-mmio and DT (in comparison, for example, with "foreign mapping + virtio-iommu" solution which would
>> require the whole new complex emulator in hypervisor in addition to new functionality/hypercall to pass IOVA
>> from the virtio backend running elsewhere to the hypervisor and translate it to the GPA before mapping into
>> P2M or denying the foreign mapping request if no corresponding IOVA-GPA mapping present in the IOMMU page table
>> for that particular device). We only need to update toolstack to insert a new "xen,dev-domid" property to
>> the virtio-mmio device node when creating a guest device-tree (this is an indicator for the guest to use grants
>> and the ID of Xen domain where the corresponding backend resides, it is used as an argument to the grant mapping
>> APIs). It worth mentioning that toolstack patch is based on non  upstreamed yet “Virtio support for toolstack
>> on Arm” series which is on review now [2].
>>
>> Please note the following:
>> - Patch series only covers Arm and virtio-mmio (device-tree) for now. To enable the restricted memory access
>>    feature on Arm the following options should be set:
>>    CONFIG_XEN_VIRTIO = y
>>    CONFIG_XEN_HVM_VIRTIO_GRANT = y
>> - Some callbacks in xen-virtio DMA ops layer (map_sg/unmap_sg, etc) are not implemented yet as they are not
>>    needed/used in the first prototype
>>
>> Patch series is rebased on Linux 5.18-rc2 tag and tested on Renesas Salvator-X board + H3 ES3.0 SoC (Arm64)
>> with standalone userspace (non-Qemu) virtio-mmio based virtio-disk backend running in Driver domain and Linux
>> guest running on existing virtio-blk driver (frontend). No issues were observed. Guest domain 'reboot/destroy'
>> use-cases work properly. I have also tested other use-cases such as assigning several virtio block devices
>> or a mix of virtio and Xen PV block devices to the guest.
>>
>> 1. Xen changes located at (last patch):
>> https://github.com/otyshchenko1/xen/commits/libxl_virtio_next
>> 2. Linux changes located at:
>> https://github.com/otyshchenko1/linux/commits/virtio_grant5
>> 3. virtio-disk changes located at:
>> https://github.com/otyshchenko1/virtio-disk/commits/virtio_grant
>>
>> Any feedback/help would be highly appreciated.
>>
>> [1] https://www.youtube.com/watch?v=IrlEdaIUDPk
>> [2] https://lore.kernel.org/xen-devel/1649442065-8332-1-git-send-email-olekstysh@gmail.com/
>>
>> Juergen Gross (2):
>>    xen/grants: support allocating consecutive grants
>>    virtio: add option to restrict memory access under Xen
>>
>> Oleksandr Tyshchenko (4):
>>    dt-bindings: xen: Add xen,dev-domid property description for
>>      xen-virtio layer
>>    virtio: Various updates to xen-virtio DMA ops layer
>>    arm/xen: Introduce xen_setup_dma_ops()
>>    arm/xen: Assign xen-virtio DMA ops for virtio devices in Xen guests
>>
>>   .../devicetree/bindings/virtio/xen,dev-domid.yaml  |  39 +++
>>   arch/arm/include/asm/xen/xen-ops.h                 |   1 +
>>   arch/arm/mm/dma-mapping.c                          |   5 +-
>>   arch/arm/xen/enlighten.c                           |  11 +
>>   arch/arm64/include/asm/xen/xen-ops.h               |   1 +
>>   arch/arm64/mm/dma-mapping.c                        |   5 +-
>>   arch/x86/mm/init.c                                 |  15 +
>>   arch/x86/mm/mem_encrypt.c                          |   5 -
>>   arch/x86/xen/Kconfig                               |   9 +
>>   drivers/xen/Kconfig                                |  20 ++
>>   drivers/xen/Makefile                               |   1 +
>>   drivers/xen/grant-table.c                          | 238 +++++++++++++--
>>   drivers/xen/xen-virtio.c                           | 335 +++++++++++++++++++++
>>   include/xen/arm/xen-ops.h                          |  20 ++
>>   include/xen/grant_table.h                          |   4 +
>>   include/xen/xen-ops.h                              |  13 +
>>   16 files changed, 679 insertions(+), 43 deletions(-)
>>   create mode 100644 Documentation/devicetree/bindings/virtio/xen,dev-domid.yaml
>>   create mode 100644 arch/arm/include/asm/xen/xen-ops.h
>>   create mode 100644 arch/arm64/include/asm/xen/xen-ops.h
>>   create mode 100644 drivers/xen/xen-virtio.c
>>   create mode 100644 include/xen/arm/xen-ops.h
>>
>> -- 
>> 2.7.4
Oleksandr Tyshchenko April 17, 2022, 5:02 p.m. UTC | #6
On 16.04.22 01:01, Stefano Stabellini wrote:


Hello Stefano


> On Thu, 14 Apr 2022, Oleksandr Tyshchenko wrote:
>> From: Juergen Gross <jgross@suse.com>
>>
>> In order to support virtio in Xen guests add a config option enabling
>> the user to specify whether in all Xen guests virtio should be able to
>> access memory via Xen grant mappings only on the host side.
>>
>> This applies to fully virtualized guests only, as for paravirtualized
>> guests this is mandatory.
>>
>> This requires to switch arch_has_restricted_virtio_memory_access()
>> from a pure stub to a real function on x86 systems (Arm systems are
>> not covered by now).
>>
>> Add the needed functionality by providing a special set of DMA ops
>> handling the needed grant operations for the I/O pages.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   arch/x86/mm/init.c        |  15 ++++
>>   arch/x86/mm/mem_encrypt.c |   5 --
>>   arch/x86/xen/Kconfig      |   9 +++
>>   drivers/xen/Kconfig       |  20 ++++++
>>   drivers/xen/Makefile      |   1 +
>>   drivers/xen/xen-virtio.c  | 177 ++++++++++++++++++++++++++++++++++++++++++++++
>>   include/xen/xen-ops.h     |   8 +++
>>   7 files changed, 230 insertions(+), 5 deletions(-)
>>   create mode 100644 drivers/xen/xen-virtio.c
>>
>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>> index d8cfce2..526a3b2 100644
>> --- a/arch/x86/mm/init.c
>> +++ b/arch/x86/mm/init.c
>> @@ -8,6 +8,8 @@
>>   #include <linux/kmemleak.h>
>>   #include <linux/sched/task.h>
>>   
>> +#include <xen/xen.h>
>> +
>>   #include <asm/set_memory.h>
>>   #include <asm/e820/api.h>
>>   #include <asm/init.h>
>> @@ -1065,3 +1067,16 @@ unsigned long max_swapfile_size(void)
>>   	return pages;
>>   }
>>   #endif
>> +
>> +#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>> +int arch_has_restricted_virtio_memory_access(void)
>> +{
>> +	if (IS_ENABLED(CONFIG_XEN_PV_VIRTIO) && xen_pv_domain())
>> +		return 1;
>> +	if (IS_ENABLED(CONFIG_XEN_HVM_VIRTIO_GRANT) && xen_hvm_domain())
>> +		return 1;
> I think these two checks could be moved to a separate function in a Xen
> header, e.g. xen_restricted_virtio_memory_access, and here you could
> just
>
> if (xen_restricted_virtio_memory_access())
>      return 1;

Agree, will do


>
>
>
>> +	return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
>> +}
>> +EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
>> +#endif
>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>> index 50d2099..dda020f 100644
>> --- a/arch/x86/mm/mem_encrypt.c
>> +++ b/arch/x86/mm/mem_encrypt.c
>> @@ -77,8 +77,3 @@ void __init mem_encrypt_init(void)
>>   	print_mem_encrypt_feature_info();
>>   }
>>   
>> -int arch_has_restricted_virtio_memory_access(void)
>> -{
>> -	return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
>> -}
>> -EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
>> diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
>> index 85246dd..dffdffd 100644
>> --- a/arch/x86/xen/Kconfig
>> +++ b/arch/x86/xen/Kconfig
>> @@ -92,3 +92,12 @@ config XEN_DOM0
>>   	select X86_X2APIC if XEN_PVH && X86_64
>>   	help
>>   	  Support running as a Xen Dom0 guest.
>> +
>> +config XEN_PV_VIRTIO
>> +	bool "Xen virtio support for PV guests"
>> +	depends on XEN_VIRTIO && XEN_PV
>> +	default y
>> +	help
>> +	  Support virtio for running as a paravirtualized guest. This will
>> +	  need support on the backend side (qemu or kernel, depending on the
>> +	  virtio device types used).
>> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
>> index 120d32f..fc61f7a 100644
>> --- a/drivers/xen/Kconfig
>> +++ b/drivers/xen/Kconfig
>> @@ -335,4 +335,24 @@ config XEN_UNPOPULATED_ALLOC
>>   	  having to balloon out RAM regions in order to obtain physical memory
>>   	  space to create such mappings.
>>   
>> +config XEN_VIRTIO
>> +	bool "Xen virtio support"
>> +	default n
>> +	depends on VIRTIO && DMA_OPS
>> +	select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>> +	help
>> +	  Enable virtio support for running as Xen guest. Depending on the
>> +	  guest type this will require special support on the backend side
>> +	  (qemu or kernel, depending on the virtio device types used).
>> +
>> +config XEN_HVM_VIRTIO_GRANT
>> +	bool "Require virtio for fully virtualized guests to use grant mappings"
>> +	depends on XEN_VIRTIO && X86_64
>> +	default y
>> +	help
>> +	  Require virtio for fully virtualized guests to use grant mappings.
>> +	  This will avoid the need to give the backend the right to map all
>> +	  of the guest memory. This will need support on the backend side
>> +	  (qemu or kernel, depending on the virtio device types used).
> I don't think we need 3 visible kconfig options for this.
>
> In fact, I would only add one: XEN_VIRTIO. We can have any X86 (or ARM)
> specific dependencies in the "depends" line under XEN_VIRTIO. And I
> don't think we need XEN_HVM_VIRTIO_GRANT as a kconfig option
> necessarily. It doesn't seem like something we want as a build-time option. At
> most, it could be a runtime option (like a command line) or a debug
> option (like a #define at the top of the source file).


I don't know what the initial idea was behind having the extra
XEN_HVM_VIRTIO and XEN_PV_VIRTIO options, but taking into account that
they are currently only used in arch_has_restricted_virtio_memory_access(),
I share your opinion regarding a single XEN_VIRTIO option.

Looking ahead (including changes in the commit #4), we can imagine the 
resulting option:

config XEN_VIRTIO
     bool "Xen virtio support"
     default n
     depends on VIRTIO && DMA_OPS
     depends on (X86_64 || ARM || ARM64)
     select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
     help
       Enable virtio support for running as Xen guest. Depending on the
       guest type this will require special support on the backend side
       (qemu or kernel, depending on the virtio device types used).


and then arch_has_restricted_virtio_memory_access() per arch:


1. x86:

int arch_has_restricted_virtio_memory_access(void)
{
     return (xen_has_restricted_virtio_memory_access() ||
             cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT));
}


2. Arm:

int arch_has_restricted_virtio_memory_access(void)
{
     return xen_has_restricted_virtio_memory_access();
}


3. xen.h:

static inline int xen_has_restricted_virtio_memory_access(void)
{
     if (IS_ENABLED(CONFIG_XEN_VIRTIO) &&
         (xen_pv_domain() || xen_hvm_domain()))
         return 1;

     return 0;
}


Actually, as the domain type on Arm is always XEN_HVM_DOMAIN, we could
probably have the following on Arm:

int arch_has_restricted_virtio_memory_access(void)
{
     return IS_ENABLED(CONFIG_XEN_VIRTIO);
}

but I would prefer not to diverge and to use the common
xen_has_restricted_virtio_memory_access().

Any thoughts?



>
>
>>   endmenu
>> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
>> index 5aae66e..767009c 100644
>> --- a/drivers/xen/Makefile
>> +++ b/drivers/xen/Makefile
>> @@ -39,3 +39,4 @@ xen-gntalloc-y				:= gntalloc.o
>>   xen-privcmd-y				:= privcmd.o privcmd-buf.o
>>   obj-$(CONFIG_XEN_FRONT_PGDIR_SHBUF)	+= xen-front-pgdir-shbuf.o
>>   obj-$(CONFIG_XEN_UNPOPULATED_ALLOC)	+= unpopulated-alloc.o
>> +obj-$(CONFIG_XEN_VIRTIO)		+= xen-virtio.o
>> diff --git a/drivers/xen/xen-virtio.c b/drivers/xen/xen-virtio.c
>> new file mode 100644
>> index 00000000..cfd5eda
>> --- /dev/null
>> +++ b/drivers/xen/xen-virtio.c
>> @@ -0,0 +1,177 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/******************************************************************************
>> + * Xen virtio driver - enables using virtio devices in Xen guests.
>> + *
>> + * Copyright (c) 2021, Juergen Gross <jgross@suse.com>
>> + */
>> +
>> +#include <linux/module.h>
>> +#include <linux/dma-map-ops.h>
>> +#include <linux/pci.h>
>> +#include <linux/pfn.h>
>> +#include <linux/virtio_config.h>
>> +#include <xen/xen.h>
>> +#include <xen/grant_table.h>
>> +
>> +#define XEN_GRANT_ADDR_OFF	0x8000000000000000ULL
> NIT: (1ULL << 31)

ok, I assume you meant (1ULL << 63)?


>
>
>> +static inline dma_addr_t grant_to_dma(grant_ref_t grant)
>> +{
>> +	return XEN_GRANT_ADDR_OFF | ((dma_addr_t)grant << PAGE_SHIFT);
>> +}
>> +
>> +static inline grant_ref_t dma_to_grant(dma_addr_t dma)
>> +{
>> +	return (grant_ref_t)((dma & ~XEN_GRANT_ADDR_OFF) >> PAGE_SHIFT);
>> +}
>> +
>> +/*
>> + * DMA ops for Xen virtio frontends.
>> + *
>> + * Used to act as a kind of software IOMMU for Xen guests by using grants as
>> + * DMA addresses.
>> + * Such a DMA address is formed by using the grant reference as a frame
>> + * number and setting the highest address bit (this bit is for the backend
>> + * to be able to distinguish it from e.g. a mmio address).
>> + *
>> + * Note that for now we hard wire dom0 to be the backend domain. In order to
>> + * support any domain as backend we'd need to add a way to communicate the
>> + * domid of this backend, e.g. via Xenstore or via the PCI-device's config
>> + * space.
> I would add device tree as possible way of domid communication

I agree, but the changes in commit #4 (which add DT support and remove
the hardcoded domid 0) render this comment stale. For the next version I
will squash the changes and drop or rephrase this comment.


>
>
>> + */
>> +static void *xen_virtio_dma_alloc(struct device *dev, size_t size,
>> +				  dma_addr_t *dma_handle, gfp_t gfp,
>> +				  unsigned long attrs)
>> +{
>> +	unsigned int n_pages = PFN_UP(size);
>> +	unsigned int i;
>> +	unsigned long pfn;
>> +	grant_ref_t grant;
>> +	void *ret;
>> +
>> +	ret = (void *)__get_free_pages(gfp, get_order(size));
>> +	if (!ret)
>> +		return NULL;
>> +
>> +	pfn = virt_to_pfn(ret);
>> +
>> +	if (gnttab_alloc_grant_reference_seq(n_pages, &grant)) {
>> +		free_pages((unsigned long)ret, get_order(size));
>> +		return NULL;
>> +	}
>> +
>> +	for (i = 0; i < n_pages; i++) {
>> +		gnttab_grant_foreign_access_ref(grant + i, 0,
>> +						pfn_to_gfn(pfn + i), 0);
>> +	}
>> +
>> +	*dma_handle = grant_to_dma(grant);
>> +
>> +	return ret;
>> +}
>> +
>> +static void xen_virtio_dma_free(struct device *dev, size_t size, void *vaddr,
>> +				dma_addr_t dma_handle, unsigned long attrs)
>> +{
>> +	unsigned int n_pages = PFN_UP(size);
>> +	unsigned int i;
>> +	grant_ref_t grant;
>> +
>> +	grant = dma_to_grant(dma_handle);
>> +
>> +	for (i = 0; i < n_pages; i++)
>> +		gnttab_end_foreign_access_ref(grant + i);
>> +
>> +	gnttab_free_grant_reference_seq(grant, n_pages);
>> +
>> +	free_pages((unsigned long)vaddr, get_order(size));
>> +}
>> +
>> +static struct page *xen_virtio_dma_alloc_pages(struct device *dev, size_t size,
>> +					       dma_addr_t *dma_handle,
>> +					       enum dma_data_direction dir,
>> +					       gfp_t gfp)
>> +{
>> +	WARN_ONCE(1, "xen_virtio_dma_alloc_pages size %ld\n", size);
>> +	return NULL;
>> +}
>> +
>> +static void xen_virtio_dma_free_pages(struct device *dev, size_t size,
>> +				      struct page *vaddr, dma_addr_t dma_handle,
>> +				      enum dma_data_direction dir)
>> +{
>> +	WARN_ONCE(1, "xen_virtio_dma_free_pages size %ld\n", size);
>> +}
>> +
>> +static dma_addr_t xen_virtio_dma_map_page(struct device *dev, struct page *page,
>> +					  unsigned long offset, size_t size,
>> +					  enum dma_data_direction dir,
>> +					  unsigned long attrs)
>> +{
>> +	grant_ref_t grant;
>> +
>> +	if (gnttab_alloc_grant_references(1, &grant))
>> +		return 0;
>> +
>> +	gnttab_grant_foreign_access_ref(grant, 0, xen_page_to_gfn(page),
>> +					dir == DMA_TO_DEVICE);
>> +	return grant_to_dma(grant) + offset;
>> +}
>> +
>> +static void xen_virtio_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
>> +				      size_t size, enum dma_data_direction dir,
>> +				      unsigned long attrs)
>> +{
>> +	grant_ref_t grant;
>> +
>> +	grant = dma_to_grant(dma_handle);
>> +
>> +	gnttab_end_foreign_access_ref(grant);
>> +
>> +	gnttab_free_grant_reference(grant);
>> +}
>> +
>> +static int xen_virtio_dma_map_sg(struct device *dev, struct scatterlist *sg,
>> +				 int nents, enum dma_data_direction dir,
>> +				 unsigned long attrs)
>> +{
>> +	WARN_ONCE(1, "xen_virtio_dma_map_sg nents %d\n", nents);
>> +	return -EINVAL;
>> +}
>> +
>> +static void xen_virtio_dma_unmap_sg(struct device *dev, struct scatterlist *sg,
>> +				    int nents, enum dma_data_direction dir,
>> +				    unsigned long attrs)
>> +{
>> +	WARN_ONCE(1, "xen_virtio_dma_unmap_sg nents %d\n", nents);
>> +}
> You can implement xen_virtio_dma_map_sg and xen_virtio_dma_unmap_sg
> based on xen_virtio_dma_map_page and xen_virtio_dma_unmap_page, like we
> do in drivers/xen/swiotlb-xen.c.

Good point, thank you, will implement.


>
>
>> +static int xen_virtio_dma_dma_supported(struct device *dev, u64 mask)
>> +{
>> +	return 1;
>> +}
>> +
>> +static const struct dma_map_ops xen_virtio_dma_ops = {
>> +	.alloc = xen_virtio_dma_alloc,
>> +	.free = xen_virtio_dma_free,
>> +	.alloc_pages = xen_virtio_dma_alloc_pages,
>> +	.free_pages = xen_virtio_dma_free_pages,
>> +	.mmap = dma_common_mmap,
>> +	.get_sgtable = dma_common_get_sgtable,
>> +	.map_page = xen_virtio_dma_map_page,
>> +	.unmap_page = xen_virtio_dma_unmap_page,
>> +	.map_sg = xen_virtio_dma_map_sg,
>> +	.unmap_sg = xen_virtio_dma_unmap_sg,
>> +	.dma_supported = xen_virtio_dma_dma_supported,
>> +};
>> +
>> +void xen_virtio_setup_dma_ops(struct device *dev)
>> +{
>> +	dev->dma_ops = &xen_virtio_dma_ops;
>> +}
>> +EXPORT_SYMBOL_GPL(xen_virtio_setup_dma_ops);
>> +
>> +MODULE_DESCRIPTION("Xen virtio support driver");
>> +MODULE_AUTHOR("Juergen Gross <jgross@suse.com>");
>> +MODULE_LICENSE("GPL");
>> diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
>> index a3584a3..ae3c1bc 100644
>> --- a/include/xen/xen-ops.h
>> +++ b/include/xen/xen-ops.h
>> @@ -221,4 +221,12 @@ static inline void xen_preemptible_hcall_end(void) { }
>>   
>>   #endif /* CONFIG_XEN_PV && !CONFIG_PREEMPTION */
>>   
>> +#ifdef CONFIG_XEN_VIRTIO
>> +void xen_virtio_setup_dma_ops(struct device *dev);
>> +#else
>> +static inline void xen_virtio_setup_dma_ops(struct device *dev)
>> +{
>> +}
>> +#endif /* CONFIG_XEN_VIRTIO */
>> +
>>   #endif /* INCLUDE_XEN_OPS_H */
>> -- 
>> 2.7.4
>>
Oleksandr Tyshchenko April 17, 2022, 5:24 p.m. UTC | #7
On 16.04.22 01:01, Stefano Stabellini wrote:

Hello Stefano


> On Thu, 14 Apr 2022, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> Introduce Xen specific binding for the virtio-mmio device to be used
>> by Xen virtio support driver in a subsequent commit.
>>
>> This binding specifies the ID of Xen domain where the corresponding
>> device (backend) resides. This is needed for the option to restrict
>> memory access using Xen grant mappings to work.
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> ---
>>   .../devicetree/bindings/virtio/xen,dev-domid.yaml  | 39 ++++++++++++++++++++++
>>   1 file changed, 39 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/virtio/xen,dev-domid.yaml
>>
>> diff --git a/Documentation/devicetree/bindings/virtio/xen,dev-domid.yaml b/Documentation/devicetree/bindings/virtio/xen,dev-domid.yaml
>> new file mode 100644
>> index 00000000..78be993
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/virtio/xen,dev-domid.yaml
>> @@ -0,0 +1,39 @@
>> +# SPDX-License-Identifier: (GPL-2.0-only or BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/virtio/xen,dev-domid.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Xen specific binding for the virtio device
>> +
>> +maintainers:
>> +  - Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> +
>> +select: true
>> +
>> +description:
>> +  This binding specifies the ID of Xen domain where the corresponding device
>> +  (backend) resides. This is needed for the option to restrict memory access
>> +  using Xen grant mappings to work.
>> +
>> +  Note that current and generic "iommus" bindings are mutually exclusive, since
>> +  the restricted memory access model on Xen behaves as a kind of software IOMMU.
> I don't think that this last statement is necessary or fully accurate, so
> I would remove it.


ok, will remove


> Other than that, this looks good to me.


thank you


>
>
>> +properties:
>> +  xen,dev-domid:
>> +    $ref: /schemas/types.yaml#/definitions/uint32
>> +    description:
>> +      Should contain the ID of device's domain.
> Maybe better as:
> "The domid (domain ID) of the domain where the device (backend) is running"


ok, will change


>
>
>
>> +additionalProperties: true
>> +
>> +examples:
>> +  - |
>> +    virtio_block@3000 {
>> +            compatible = "virtio,mmio";
>> +            reg = <0x3000 0x100>;
>> +            interrupts = <41>;
>> +
>> +            /* The device is located in Xen domain with ID 1 */
>> +            xen,dev-domid = <1>;
>> +    };
>> -- 
>> 2.7.4
>>
Stefano Stabellini April 18, 2022, 7:11 p.m. UTC | #8
On Sun, 17 Apr 2022, Oleksandr wrote:
> On 16.04.22 01:01, Stefano Stabellini wrote:
> > On Thu, 14 Apr 2022, Oleksandr Tyshchenko wrote:
> > > From: Juergen Gross <jgross@suse.com>
> > > 
> > > In order to support virtio in Xen guests add a config option enabling
> > > the user to specify whether in all Xen guests virtio should be able to
> > > access memory via Xen grant mappings only on the host side.
> > > 
> > > This applies to fully virtualized guests only, as for paravirtualized
> > > guests this is mandatory.
> > > 
> > > This requires to switch arch_has_restricted_virtio_memory_access()
> > > from a pure stub to a real function on x86 systems (Arm systems are
> > > not covered by now).
> > > 
> > > Add the needed functionality by providing a special set of DMA ops
> > > handling the needed grant operations for the I/O pages.
> > > 
> > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > > ---
> > >   arch/x86/mm/init.c        |  15 ++++
> > >   arch/x86/mm/mem_encrypt.c |   5 --
> > >   arch/x86/xen/Kconfig      |   9 +++
> > >   drivers/xen/Kconfig       |  20 ++++++
> > >   drivers/xen/Makefile      |   1 +
> > >   drivers/xen/xen-virtio.c  | 177
> > > ++++++++++++++++++++++++++++++++++++++++++++++
> > >   include/xen/xen-ops.h     |   8 +++
> > >   7 files changed, 230 insertions(+), 5 deletions(-)
> > >   create mode 100644 drivers/xen/xen-virtio.c
> > > 
> > > diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> > > index d8cfce2..526a3b2 100644
> > > --- a/arch/x86/mm/init.c
> > > +++ b/arch/x86/mm/init.c
> > > @@ -8,6 +8,8 @@
> > >   #include <linux/kmemleak.h>
> > >   #include <linux/sched/task.h>
> > >   +#include <xen/xen.h>
> > > +
> > >   #include <asm/set_memory.h>
> > >   #include <asm/e820/api.h>
> > >   #include <asm/init.h>
> > > @@ -1065,3 +1067,16 @@ unsigned long max_swapfile_size(void)
> > >   	return pages;
> > >   }
> > >   #endif
> > > +
> > > +#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
> > > +int arch_has_restricted_virtio_memory_access(void)
> > > +{
> > > +	if (IS_ENABLED(CONFIG_XEN_PV_VIRTIO) && xen_pv_domain())
> > > +		return 1;
> > > +	if (IS_ENABLED(CONFIG_XEN_HVM_VIRTIO_GRANT) && xen_hvm_domain())
> > > +		return 1;
> > I think these two checks could be moved to a separate function in a Xen
> > header, e.g. xen_restricted_virtio_memory_access, and here you could
> > just
> > 
> > if (xen_restricted_virtio_memory_access())
> >      return 1;
> 
> Agree, will do
> 
> 
> > 
> > 
> > 
> > > +	return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
> > > +}
> > > +EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
> > > +#endif
> > > diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> > > index 50d2099..dda020f 100644
> > > --- a/arch/x86/mm/mem_encrypt.c
> > > +++ b/arch/x86/mm/mem_encrypt.c
> > > @@ -77,8 +77,3 @@ void __init mem_encrypt_init(void)
> > >   	print_mem_encrypt_feature_info();
> > >   }
> > >   -int arch_has_restricted_virtio_memory_access(void)
> > > -{
> > > -	return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
> > > -}
> > > -EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
> > > diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
> > > index 85246dd..dffdffd 100644
> > > --- a/arch/x86/xen/Kconfig
> > > +++ b/arch/x86/xen/Kconfig
> > > @@ -92,3 +92,12 @@ config XEN_DOM0
> > >   	select X86_X2APIC if XEN_PVH && X86_64
> > >   	help
> > >   	  Support running as a Xen Dom0 guest.
> > > +
> > > +config XEN_PV_VIRTIO
> > > +	bool "Xen virtio support for PV guests"
> > > +	depends on XEN_VIRTIO && XEN_PV
> > > +	default y
> > > +	help
> > > +	  Support virtio for running as a paravirtualized guest. This will
> > > +	  need support on the backend side (qemu or kernel, depending on the
> > > +	  virtio device types used).
> > > diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
> > > index 120d32f..fc61f7a 100644
> > > --- a/drivers/xen/Kconfig
> > > +++ b/drivers/xen/Kconfig
> > > @@ -335,4 +335,24 @@ config XEN_UNPOPULATED_ALLOC
> > >   	  having to balloon out RAM regions in order to obtain physical memory
> > >   	  space to create such mappings.
> > >   +config XEN_VIRTIO
> > > +	bool "Xen virtio support"
> > > +	default n
> > > +	depends on VIRTIO && DMA_OPS
> > > +	select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
> > > +	help
> > > +	  Enable virtio support for running as Xen guest. Depending on the
> > > +	  guest type this will require special support on the backend side
> > > +	  (qemu or kernel, depending on the virtio device types used).
> > > +
> > > +config XEN_HVM_VIRTIO_GRANT
> > > +	bool "Require virtio for fully virtualized guests to use grant
> > > mappings"
> > > +	depends on XEN_VIRTIO && X86_64
> > > +	default y
> > > +	help
> > > +	  Require virtio for fully virtualized guests to use grant mappings.
> > > +	  This will avoid the need to give the backend the right to map all
> > > +	  of the guest memory. This will need support on the backend side
> > > +	  (qemu or kernel, depending on the virtio device types used).
> > I don't think we need 3 visible kconfig options for this.
> > 
> > In fact, I would only add one: XEN_VIRTIO. We can have any X86 (or ARM)
> > specific dependencies in the "depends" line under XEN_VIRTIO. And I
> > don't think we need XEN_HVM_VIRTIO_GRANT as a kconfig option
> > necessarily. It doesn't seem like something we want as a build-time option. At
> > most, it could be a runtime option (like a command line) or a debug
> > option (like a #define at the top of the source file).
> 
> 
> I don't know what the initial idea was behind having the extra XEN_HVM_VIRTIO and
> XEN_PV_VIRTIO options, but taking into account that
> they are currently only used in arch_has_restricted_virtio_memory_access(), I
> share your opinion regarding a single XEN_VIRTIO option.
> 
> Looking ahead (including changes in the commit #4), we can imagine the
> resulting option:
> 
> config XEN_VIRTIO
>     bool "Xen virtio support"
>     default n
>     depends on VIRTIO && DMA_OPS
>     depends on (X86_64 || ARM || ARM64)
>     select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>     help
>       Enable virtio support for running as Xen guest. Depending on the
>       guest type this will require special support on the backend side
>       (qemu or kernel, depending on the virtio device types used).
> 
> 
> and then arch_has_restricted_virtio_memory_access() per arch:
> 
> 
> 1. x86:
> 
> int arch_has_restricted_virtio_memory_access(void)
> {
>     return (xen_has_restricted_virtio_memory_access() ||
>             cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT));
> }
> 
> 
> 2. Arm:
> 
> int arch_has_restricted_virtio_memory_access(void)
> {
>     return xen_has_restricted_virtio_memory_access();
> }
> 
> 
> 3. xen.h:
> 
> static inline int xen_has_restricted_virtio_memory_access(void)
> {
>     if (IS_ENABLED(CONFIG_XEN_VIRTIO) &&
>         (xen_pv_domain() || xen_hvm_domain()))
>         return 1;
> 
>     return 0;
> }
> 
> 
> Actually, as domain type on Arm is always XEN_HVM_DOMAIN, we could probably
> have the following on Arm:
> 
> int arch_has_restricted_virtio_memory_access(void)
> {
>     return IS_ENABLED(CONFIG_XEN_VIRTIO);
> }
> 
> but I would prefer not to diverge and use common
> xen_has_restricted_virtio_memory_access().
> 
> Any thoughts?

Yes, I would also prefer not to diverge between the x86 and arm versions
of xen_has_restricted_virtio_memory_access. But what case are we trying
to catch with (xen_pv_domain() || xen_hvm_domain())? Even on x86, it is
not going to leave much out. Is it really meant only to exclude PVH
domains?

I have the feeling that we could turn this check into:

static inline int xen_has_restricted_virtio_memory_access(void)
{
    return IS_ENABLED(CONFIG_XEN_VIRTIO) && xen_domain();
}

even on x86, but one of the xen/x86 maintainers should confirm.
Jürgen Groß April 19, 2022, 6:21 a.m. UTC | #9
On 18.04.22 21:11, Stefano Stabellini wrote:
> On Sun, 17 Apr 2022, Oleksandr wrote:
>> On 16.04.22 01:01, Stefano Stabellini wrote:
>>> On Thu, 14 Apr 2022, Oleksandr Tyshchenko wrote:
>>>> From: Juergen Gross <jgross@suse.com>
>>>>
>>>> In order to support virtio in Xen guests add a config option enabling
>>>> the user to specify whether in all Xen guests virtio should be able to
>>>> access memory via Xen grant mappings only on the host side.
>>>>
>>>> This applies to fully virtualized guests only, as for paravirtualized
>>>> guests this is mandatory.
>>>>
>>>> This requires to switch arch_has_restricted_virtio_memory_access()
>>>> from a pure stub to a real function on x86 systems (Arm systems are
>>>> not covered by now).
>>>>
>>>> Add the needed functionality by providing a special set of DMA ops
>>>> handling the needed grant operations for the I/O pages.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ---
>>>>    arch/x86/mm/init.c        |  15 ++++
>>>>    arch/x86/mm/mem_encrypt.c |   5 --
>>>>    arch/x86/xen/Kconfig      |   9 +++
>>>>    drivers/xen/Kconfig       |  20 ++++++
>>>>    drivers/xen/Makefile      |   1 +
>>>>    drivers/xen/xen-virtio.c  | 177
>>>> ++++++++++++++++++++++++++++++++++++++++++++++
>>>>    include/xen/xen-ops.h     |   8 +++
>>>>    7 files changed, 230 insertions(+), 5 deletions(-)
>>>>    create mode 100644 drivers/xen/xen-virtio.c
>>>>
>>>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>>>> index d8cfce2..526a3b2 100644
>>>> --- a/arch/x86/mm/init.c
>>>> +++ b/arch/x86/mm/init.c
>>>> @@ -8,6 +8,8 @@
>>>>    #include <linux/kmemleak.h>
>>>>    #include <linux/sched/task.h>
>>>>    +#include <xen/xen.h>
>>>> +
>>>>    #include <asm/set_memory.h>
>>>>    #include <asm/e820/api.h>
>>>>    #include <asm/init.h>
>>>> @@ -1065,3 +1067,16 @@ unsigned long max_swapfile_size(void)
>>>>    	return pages;
>>>>    }
>>>>    #endif
>>>> +
>>>> +#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>>>> +int arch_has_restricted_virtio_memory_access(void)
>>>> +{
>>>> +	if (IS_ENABLED(CONFIG_XEN_PV_VIRTIO) && xen_pv_domain())
>>>> +		return 1;
>>>> +	if (IS_ENABLED(CONFIG_XEN_HVM_VIRTIO_GRANT) && xen_hvm_domain())
>>>> +		return 1;
>>> I think these two checks could be moved to a separate function in a Xen
>>> header, e.g. xen_restricted_virtio_memory_access, and here you could
>>> just
>>>
>>> if (xen_restricted_virtio_memory_access())
>>>       return 1;
>>
>> Agree, will do
>>
>>
>>>
>>>
>>>
>>>> +	return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
>>>> +#endif
>>>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>>>> index 50d2099..dda020f 100644
>>>> --- a/arch/x86/mm/mem_encrypt.c
>>>> +++ b/arch/x86/mm/mem_encrypt.c
>>>> @@ -77,8 +77,3 @@ void __init mem_encrypt_init(void)
>>>>    	print_mem_encrypt_feature_info();
>>>>    }
>>>>    -int arch_has_restricted_virtio_memory_access(void)
>>>> -{
>>>> -	return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
>>>> -}
>>>> -EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
>>>> diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
>>>> index 85246dd..dffdffd 100644
>>>> --- a/arch/x86/xen/Kconfig
>>>> +++ b/arch/x86/xen/Kconfig
>>>> @@ -92,3 +92,12 @@ config XEN_DOM0
>>>>    	select X86_X2APIC if XEN_PVH && X86_64
>>>>    	help
>>>>    	  Support running as a Xen Dom0 guest.
>>>> +
>>>> +config XEN_PV_VIRTIO
>>>> +	bool "Xen virtio support for PV guests"
>>>> +	depends on XEN_VIRTIO && XEN_PV
>>>> +	default y
>>>> +	help
>>>> +	  Support virtio for running as a paravirtualized guest. This will
>>>> +	  need support on the backend side (qemu or kernel, depending on the
>>>> +	  virtio device types used).
>>>> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
>>>> index 120d32f..fc61f7a 100644
>>>> --- a/drivers/xen/Kconfig
>>>> +++ b/drivers/xen/Kconfig
>>>> @@ -335,4 +335,24 @@ config XEN_UNPOPULATED_ALLOC
>>>>    	  having to balloon out RAM regions in order to obtain physical memory
>>>>    	  space to create such mappings.
>>>>    +config XEN_VIRTIO
>>>> +	bool "Xen virtio support"
>>>> +	default n
>>>> +	depends on VIRTIO && DMA_OPS
>>>> +	select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>>>> +	help
>>>> +	  Enable virtio support for running as Xen guest. Depending on the
>>>> +	  guest type this will require special support on the backend side
>>>> +	  (qemu or kernel, depending on the virtio device types used).
>>>> +
>>>> +config XEN_HVM_VIRTIO_GRANT
>>>> +	bool "Require virtio for fully virtualized guests to use grant
>>>> mappings"
>>>> +	depends on XEN_VIRTIO && X86_64
>>>> +	default y
>>>> +	help
>>>> +	  Require virtio for fully virtualized guests to use grant mappings.
>>>> +	  This will avoid the need to give the backend the right to map all
>>>> +	  of the guest memory. This will need support on the backend side
>>>> +	  (qemu or kernel, depending on the virtio device types used).
>>> I don't think we need 3 visible kconfig options for this.
>>>
>>> In fact, I would only add one: XEN_VIRTIO. We can have any X86 (or ARM)
>>> specific dependencies in the "depends" line under XEN_VIRTIO. And I
>>> don't think we need XEN_HVM_VIRTIO_GRANT as a kconfig option
>>> necessarily. It doesn't seem like something we want as a build-time
>>> option. At most, it could be a runtime option (like a command line) or a
>>> debug option (like a #define at the top of the source file.)
>>
>>
>> I don't know what the initial idea behind the extra XEN_HVM_VIRTIO and
>> XEN_PV_VIRTIO options was, but taking into account that
>> they are only used in arch_has_restricted_virtio_memory_access() currently, I
>> share your opinion regarding a single XEN_VIRTIO option.
>>
>> Looking ahead (including changes in the commit #4), we can imagine the
>> resulting option:
>>
>> config XEN_VIRTIO
>>      bool "Xen virtio support"
>>      default n
>>      depends on VIRTIO && DMA_OPS
>>      depends on (X86_64 || ARM || ARM64)
>>      select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>>      help
>>        Enable virtio support for running as Xen guest. Depending on the
>>        guest type this will require special support on the backend side
>>        (qemu or kernel, depending on the virtio device types used).
>>
>>
>> and then arch_has_restricted_virtio_memory_access() per arch:
>>
>>
>> 1. x86:
>>
>> int arch_has_restricted_virtio_memory_access(void)
>> {
>>      return (xen_has_restricted_virtio_memory_access() ||
>>              cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT));
>> }
>>
>>
>> 2. Arm:
>>
>> int arch_has_restricted_virtio_memory_access(void)
>> {
>>      return xen_has_restricted_virtio_memory_access();
>> }
>>
>>
>> 3. xen.h:
>>
>> static inline int xen_has_restricted_virtio_memory_access(void)
>> {
>>      if (IS_ENABLED(CONFIG_XEN_VIRTIO) && (xen_pv_domain() || xen_hvm_domain()))
>>          return 1;
>>
>>      return 0;
>> }
>>
>>
>> Actually, as domain type on Arm is always XEN_HVM_DOMAIN, we could probably
>> have the following on Arm:
>>
>> int arch_has_restricted_virtio_memory_access(void)
>> {
>>      return IS_ENABLED(CONFIG_XEN_VIRTIO);
>> }
>>
>> but I would prefer not to diverge and use common
>> xen_has_restricted_virtio_memory_access().
>>
>> Any thoughts?
> 
> Yes, I would also prefer not to diverge between the x86 and arm versions
> of xen_has_restricted_virtio_memory_access. But what case are we trying
> to catch with (xen_pv_domain() || xen_hvm_domain()) ? Even on x86, it is
> not going to leave much out. Is it really meant only to exclude pvh
> domains?

It wouldn't exclude pvh domains.

> 
> I have the feeling that we could turn this check into:
> 
> static inline int xen_has_restricted_virtio_memory_access(void)
> {
>      return IS_ENABLED(CONFIG_XEN_VIRTIO) && xen_domain();
> }
> 
> even on x86, but one of the xen/x86 maintainers should confirm.

I do confirm this is better and functionally equivalent.


Juergen
Oleksandr Tyshchenko April 19, 2022, 6:37 a.m. UTC | #10
Hello Stefano, Juergen


On 19.04.22 09:21, Juergen Gross wrote:
> On 18.04.22 21:11, Stefano Stabellini wrote:
>> On Sun, 17 Apr 2022, Oleksandr wrote:
>>> On 16.04.22 01:01, Stefano Stabellini wrote:
>>>> On Thu, 14 Apr 2022, Oleksandr Tyshchenko wrote:
>>>>> From: Juergen Gross <jgross@suse.com>
>>>>>
>>>>> In order to support virtio in Xen guests add a config option enabling
>>>>> the user to specify whether in all Xen guests virtio should be 
>>>>> able to
>>>>> access memory via Xen grant mappings only on the host side.
>>>>>
>>>>> This applies to fully virtualized guests only, as for paravirtualized
>>>>> guests this is mandatory.
>>>>>
>>>>> This requires to switch arch_has_restricted_virtio_memory_access()
>>>>> from a pure stub to a real function on x86 systems (Arm systems are
>>>>> not covered by now).
>>>>>
>>>>> Add the needed functionality by providing a special set of DMA ops
>>>>> handling the needed grant operations for the I/O pages.
>>>>>
>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>> ---
>>>>>    arch/x86/mm/init.c        |  15 ++++
>>>>>    arch/x86/mm/mem_encrypt.c |   5 --
>>>>>    arch/x86/xen/Kconfig      |   9 +++
>>>>>    drivers/xen/Kconfig       |  20 ++++++
>>>>>    drivers/xen/Makefile      |   1 +
>>>>>    drivers/xen/xen-virtio.c  | 177 ++++++++++++++++++++++++++++++++++++++++++++++
>>>>>    include/xen/xen-ops.h     |   8 +++
>>>>>    7 files changed, 230 insertions(+), 5 deletions(-)
>>>>>    create mode 100644 drivers/xen/xen-virtio.c
>>>>>
>>>>> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
>>>>> index d8cfce2..526a3b2 100644
>>>>> --- a/arch/x86/mm/init.c
>>>>> +++ b/arch/x86/mm/init.c
>>>>> @@ -8,6 +8,8 @@
>>>>>    #include <linux/kmemleak.h>
>>>>>    #include <linux/sched/task.h>
>>>>>    +#include <xen/xen.h>
>>>>> +
>>>>>    #include <asm/set_memory.h>
>>>>>    #include <asm/e820/api.h>
>>>>>    #include <asm/init.h>
>>>>> @@ -1065,3 +1067,16 @@ unsigned long max_swapfile_size(void)
>>>>>        return pages;
>>>>>    }
>>>>>    #endif
>>>>> +
>>>>> +#ifdef CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>>>>> +int arch_has_restricted_virtio_memory_access(void)
>>>>> +{
>>>>> +    if (IS_ENABLED(CONFIG_XEN_PV_VIRTIO) && xen_pv_domain())
>>>>> +        return 1;
>>>>> +    if (IS_ENABLED(CONFIG_XEN_HVM_VIRTIO_GRANT) && xen_hvm_domain())
>>>>> +        return 1;
>>>> I think these two checks could be moved to a separate function in a 
>>>> Xen
>>>> header, e.g. xen_restricted_virtio_memory_access, and here you could
>>>> just
>>>>
>>>> if (xen_restricted_virtio_memory_access())
>>>>       return 1;
>>>
>>> Agree, will do
>>>
>>>
>>>>
>>>>
>>>>
>>>>> +    return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
>>>>> +#endif
>>>>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>>>>> index 50d2099..dda020f 100644
>>>>> --- a/arch/x86/mm/mem_encrypt.c
>>>>> +++ b/arch/x86/mm/mem_encrypt.c
>>>>> @@ -77,8 +77,3 @@ void __init mem_encrypt_init(void)
>>>>>        print_mem_encrypt_feature_info();
>>>>>    }
>>>>>    -int arch_has_restricted_virtio_memory_access(void)
>>>>> -{
>>>>> -    return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
>>>>> -}
>>>>> -EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
>>>>> diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
>>>>> index 85246dd..dffdffd 100644
>>>>> --- a/arch/x86/xen/Kconfig
>>>>> +++ b/arch/x86/xen/Kconfig
>>>>> @@ -92,3 +92,12 @@ config XEN_DOM0
>>>>>        select X86_X2APIC if XEN_PVH && X86_64
>>>>>        help
>>>>>          Support running as a Xen Dom0 guest.
>>>>> +
>>>>> +config XEN_PV_VIRTIO
>>>>> +    bool "Xen virtio support for PV guests"
>>>>> +    depends on XEN_VIRTIO && XEN_PV
>>>>> +    default y
>>>>> +    help
>>>>> +      Support virtio for running as a paravirtualized guest. This 
>>>>> will
>>>>> +      need support on the backend side (qemu or kernel, depending 
>>>>> on the
>>>>> +      virtio device types used).
>>>>> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
>>>>> index 120d32f..fc61f7a 100644
>>>>> --- a/drivers/xen/Kconfig
>>>>> +++ b/drivers/xen/Kconfig
>>>>> @@ -335,4 +335,24 @@ config XEN_UNPOPULATED_ALLOC
>>>>>          having to balloon out RAM regions in order to obtain 
>>>>> physical memory
>>>>>          space to create such mappings.
>>>>>    +config XEN_VIRTIO
>>>>> +    bool "Xen virtio support"
>>>>> +    default n
>>>>> +    depends on VIRTIO && DMA_OPS
>>>>> +    select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>>>>> +    help
>>>>> +      Enable virtio support for running as Xen guest. Depending 
>>>>> on the
>>>>> +      guest type this will require special support on the backend 
>>>>> side
>>>>> +      (qemu or kernel, depending on the virtio device types used).
>>>>> +
>>>>> +config XEN_HVM_VIRTIO_GRANT
>>>>> +    bool "Require virtio for fully virtualized guests to use grant mappings"
>>>>> +    depends on XEN_VIRTIO && X86_64
>>>>> +    default y
>>>>> +    help
>>>>> +      Require virtio for fully virtualized guests to use grant 
>>>>> mappings.
>>>>> +      This will avoid the need to give the backend the right to 
>>>>> map all
>>>>> +      of the guest memory. This will need support on the backend 
>>>>> side
>>>>> +      (qemu or kernel, depending on the virtio device types used).
>>>> I don't think we need 3 visible kconfig options for this.
>>>>
>>>> In fact, I would only add one: XEN_VIRTIO. We can have any X86 (or 
>>>> ARM)
>>>> specific dependencies in the "depends" line under XEN_VIRTIO. And I
>>>> don't think we need XEN_HVM_VIRTIO_GRANT as a kconfig option
>>>> necessarily. It doesn't seem like something we want as a build-time
>>>> option. At
>>>> most, it could be a runtime option (like a command line) or a debug
>>>> option (like a #define at the top of the source file.)
>>>
>>>
>>> I don't know what the initial idea behind the extra
>>> XEN_HVM_VIRTIO and
>>> XEN_PV_VIRTIO options was, but taking into account that
>>> they are only used in arch_has_restricted_virtio_memory_access()
>>> currently, I
>>> share your opinion regarding a single XEN_VIRTIO option.
>>>
>>> Looking ahead (including changes in the commit #4), we can imagine the
>>> resulting option:
>>>
>>> config XEN_VIRTIO
>>>      bool "Xen virtio support"
>>>      default n
>>>      depends on VIRTIO && DMA_OPS
>>>      depends on (X86_64 || ARM || ARM64)
>>>      select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS
>>>      help
>>>        Enable virtio support for running as Xen guest. Depending on the
>>>        guest type this will require special support on the backend side
>>>        (qemu or kernel, depending on the virtio device types used).
>>>
>>>
>>> and then arch_has_restricted_virtio_memory_access() per arch:
>>>
>>>
>>> 1. x86:
>>>
>>> int arch_has_restricted_virtio_memory_access(void)
>>> {
>>>      return (xen_has_restricted_virtio_memory_access() ||
>>>              cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT));
>>> }
>>>
>>>
>>> 2. Arm:
>>>
>>> int arch_has_restricted_virtio_memory_access(void)
>>> {
>>>      return xen_has_restricted_virtio_memory_access();
>>> }
>>>
>>>
>>> 3. xen.h:
>>>
>>> static inline int xen_has_restricted_virtio_memory_access(void)
>>> {
>>>      if (IS_ENABLED(CONFIG_XEN_VIRTIO) && (xen_pv_domain() || xen_hvm_domain()))
>>>          return 1;
>>>
>>>      return 0;
>>> }
>>>
>>>
>>> Actually, as domain type on Arm is always XEN_HVM_DOMAIN, we could 
>>> probably
>>> have the following on Arm:
>>>
>>> int arch_has_restricted_virtio_memory_access(void)
>>> {
>>>      return IS_ENABLED(CONFIG_XEN_VIRTIO);
>>> }
>>>
>>> but I would prefer not to diverge and use common
>>> xen_has_restricted_virtio_memory_access().
>>>
>>> Any thoughts?
>>
>> Yes, I would also prefer not to diverge between the x86 and arm versions
>> of xen_has_restricted_virtio_memory_access. But what case are we trying
>> to catch with (xen_pv_domain() || xen_hvm_domain()) ? Even on x86, it is
>> not going to leave much out. Is it really meant only to exclude pvh
>> domains?

Good question. By leaving (xen_pv_domain() || xen_hvm_domain()) here I 
tried to retain what the *initial* version of 
arch_has_restricted_virtio_memory_access() covered.


>
> It wouldn't exclude pvh domains.


ok


>
>>
>> I have the feeling that we could turn this check into:
>>
>> static inline int xen_has_restricted_virtio_memory_access(void)
>> {
>>      return IS_ENABLED(CONFIG_XEN_VIRTIO) && xen_domain();
>> }
>>
>> even on x86, but one of the xen/x86 maintainers should confirm.
>
> I do confirm this is better and functionally equivalent.


Perfect, thank you for confirming. Will use that check.


>
>
> Juergen