diff mbox series

[RFC,V3,09/11] HV/IOMMU: Enable swiotlb bounce buffer for Isolation VM

Message ID 20210530150628.2063957-10-ltykernel@gmail.com (mailing list archive)
State Not Applicable
Headers show
Series x86/Hyper-V: Add Hyper-V Isolation VM support | expand

Commit Message

Tianyu Lan May 30, 2021, 3:06 p.m. UTC
From: Tianyu Lan <Tianyu.Lan@microsoft.com>

Hyper-V Isolation VM requires bounce buffer support to copy
data from/to encrypted memory and so enable swiotlb force
mode to use swiotlb bounce buffer for DMA transaction.

In Isolation VM with AMD SEV, the bounce buffer needs to be
accessed via extra address space which is above shared_gpa_boundary
(E.G 39 bit address line) reported by Hyper-V CPUID ISOLATION_CONFIG.
The access physical address will be original physical address +
shared_gpa_boundary. The shared_gpa_boundary in the AMD SEV SNP
spec is called virtual top of memory(vTOM). Memory addresses below
vTOM are automatically treated as private while memory above
vTOM is treated as shared.

ioremap_cache() can't use in the hyperv_iommu_swiotlb_init() which
is too early place and remap bounce buffer in the hyperv_iommu_swiotlb_
later_init().

Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
---
 arch/x86/xen/pci-swiotlb-xen.c |  3 +-
 drivers/hv/vmbus_drv.c         |  3 ++
 drivers/iommu/hyperv-iommu.c   | 81 ++++++++++++++++++++++++++++++++++
 include/linux/hyperv.h         |  1 +
 4 files changed, 87 insertions(+), 1 deletion(-)

Comments

Boris Ostrovsky June 2, 2021, 1:16 a.m. UTC | #1
On 5/30/21 11:06 AM, Tianyu Lan wrote:
> @@ -91,6 +92,6 @@ int pci_xen_swiotlb_init_late(void)
>  EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late);
>  
>  IOMMU_INIT_FINISH(2,
> -		  NULL,
> +		  hyperv_swiotlb_detect,
>  		  pci_xen_swiotlb_init,
>  		  NULL);


Could you explain this change?


-boris
Tianyu Lan June 2, 2021, 3:01 p.m. UTC | #2
Hi Boris:
	Thanks for your review.

On 6/2/2021 9:16 AM, Boris Ostrovsky wrote:
> 
> On 5/30/21 11:06 AM, Tianyu Lan wrote:
>> @@ -91,6 +92,6 @@ int pci_xen_swiotlb_init_late(void)
>>   EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late);
>>   
>>   IOMMU_INIT_FINISH(2,
>> -		  NULL,
>> +		  hyperv_swiotlb_detect,
>>   		  pci_xen_swiotlb_init,
>>   		  NULL);
> 
> 
> Could you explain this change?

Hyper-V allocates its own swiotlb bounce buffer and the default
swiotlb buffer should not be allocated. swiotlb_init() in 
pci_swiotlb_init() is to allocate default swiotlb buffer.
To achieve this, put hyperv_swiotlb_detect() as the first entry in the 
iommu_table_entry list. The detect loop in the pci_iommu_alloc() will 
exit once hyperv_swiotlb_detect() is called in Hyper-V VM and other 
iommu_table_entry callback will not be called.
Boris Ostrovsky June 2, 2021, 4:02 p.m. UTC | #3
On 6/2/21 11:01 AM, Tianyu Lan wrote:
> Hi Boris:
>     Thanks for your review.
>
> On 6/2/2021 9:16 AM, Boris Ostrovsky wrote:
>>
>> On 5/30/21 11:06 AM, Tianyu Lan wrote:
>>> @@ -91,6 +92,6 @@ int pci_xen_swiotlb_init_late(void)
>>>   EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late);
>>>     IOMMU_INIT_FINISH(2,
>>> -          NULL,
>>> +          hyperv_swiotlb_detect,
>>>             pci_xen_swiotlb_init,
>>>             NULL);
>>
>>
>> Could you explain this change?
>
> Hyper-V allocates its own swiotlb bounce buffer and the default
> swiotlb buffer should not be allocated. swiotlb_init() in pci_swiotlb_init() is to allocate default swiotlb buffer.
> To achieve this, put hyperv_swiotlb_detect() as the first entry in the iommu_table_entry list. The detect loop in the pci_iommu_alloc() will exit once hyperv_swiotlb_detect() is called in Hyper-V VM and other iommu_table_entry callback will not be called.



Right. But pci_xen_swiotlb_detect() will only do something for Xen PV guests, and those guests don't run on hyperV. It's either xen_pv_domain() (i.e. hypervisor_is_type(X86_HYPER_XEN_PV)) or hypervisor_is_type(X86_HYPER_MS_HYPERV) but never both. So I don't think there needs to be a dependency between the two callbacks.



-boris
Tianyu Lan June 3, 2021, 3:37 p.m. UTC | #4
On 6/3/2021 12:02 AM, Boris Ostrovsky wrote:
> 
> On 6/2/21 11:01 AM, Tianyu Lan wrote:
>> Hi Boris:
>>      Thanks for your review.
>>
>> On 6/2/2021 9:16 AM, Boris Ostrovsky wrote:
>>>
>>> On 5/30/21 11:06 AM, Tianyu Lan wrote:
>>>> @@ -91,6 +92,6 @@ int pci_xen_swiotlb_init_late(void)
>>>>    EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late);
>>>>      IOMMU_INIT_FINISH(2,
>>>> -          NULL,
>>>> +          hyperv_swiotlb_detect,
>>>>              pci_xen_swiotlb_init,
>>>>              NULL);
>>>
>>>
>>> Could you explain this change?
>>
>> Hyper-V allocates its own swiotlb bounce buffer and the default
>> swiotlb buffer should not be allocated. swiotlb_init() in pci_swiotlb_init() is to allocate default swiotlb buffer.
>> To achieve this, put hyperv_swiotlb_detect() as the first entry in the iommu_table_entry list. The detect loop in the pci_iommu_alloc() will exit once hyperv_swiotlb_detect() is called in Hyper-V VM and other iommu_table_entry callback will not be called.
> 
> 
> 
> Right. But pci_xen_swiotlb_detect() will only do something for Xen PV guests, and those guests don't run on hyperV. It's either xen_pv_domain() (i.e. hypervisor_is_type(X86_HYPER_XEN_PV)) or hypervisor_is_type(X86_HYPER_MS_HYPERV) but never both. So I don't think there needs to be a dependency between the two callbacks.

Yes, the dependency is between hyperv_swiotlb_detect() and
pci_swiotlb_detect_override()/pci_swiotlb_detect_4gb(). Now
pci_swiotlb_detect_override() and pci_swiotlb_detect_4gb() depends on
pci_xen_swiotlb_detect(). To keep dependency between
hyperv_swiotlb_detect() and pci_swiotlb_detect_override/4gb(), make 
pci_xen_swiotlb_detect() depends on hyperv_swiotlb_detect() and just to
keep order in the IOMMU table. Current iommu_table_entry only has one
depend callback and this is why I put xen depends on hyperv detect function.

Thanks.
Boris Ostrovsky June 3, 2021, 5:04 p.m. UTC | #5
On 6/3/21 11:37 AM, Tianyu Lan wrote:
>
> Yes, the dependency is between hyperv_swiotlb_detect() and
> pci_swiotlb_detect_override()/pci_swiotlb_detect_4gb(). Now
> pci_swiotlb_detect_override() and pci_swiotlb_detect_4gb() depends on
> pci_xen_swiotlb_detect(). To keep dependency between
> hyperv_swiotlb_detect() and pci_swiotlb_detect_override/4gb(), make pci_xen_swiotlb_detect() depends on hyperv_swiotlb_detect() and just to
> keep order in the IOMMU table. Current iommu_table_entry only has one
> depend callback and this is why I put xen depends on hyperv detect function.
>

Ah, ok. Thanks.



-boris
Christoph Hellwig June 7, 2021, 6:44 a.m. UTC | #6
Honestly, we really need to do away with the concept of hypervisor-
specific swiotlb allocations and just add a hypervisor hook to remap the
"main" buffer. That should remove a lot of code and confusion not just
for Xen but also any future addition like hyperv.
diff mbox series

Patch

diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index 54f9aa7e8457..43bd031aa332 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -4,6 +4,7 @@ 
 
 #include <linux/dma-map-ops.h>
 #include <linux/pci.h>
+#include <linux/hyperv.h>
 #include <xen/swiotlb-xen.h>
 
 #include <asm/xen/hypervisor.h>
@@ -91,6 +92,6 @@  int pci_xen_swiotlb_init_late(void)
 EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late);
 
 IOMMU_INIT_FINISH(pci_xen_swiotlb_detect,
-		  NULL,
+		  hyperv_swiotlb_detect,
 		  pci_xen_swiotlb_init,
 		  NULL);
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 92cb3f7d21d9..5e3bb76d4dee 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -23,6 +23,7 @@ 
 #include <linux/cpu.h>
 #include <linux/sched/task_stack.h>
 
+#include <linux/dma-map-ops.h>
 #include <linux/delay.h>
 #include <linux/notifier.h>
 #include <linux/ptrace.h>
@@ -2080,6 +2081,7 @@  struct hv_device *vmbus_device_create(const guid_t *type,
 	return child_device_obj;
 }
 
+static u64 vmbus_dma_mask = DMA_BIT_MASK(64);
 /*
  * vmbus_device_register - Register the child device
  */
@@ -2120,6 +2122,7 @@  int vmbus_device_register(struct hv_device *child_device_obj)
 	}
 	hv_debug_add_dev_dir(child_device_obj);
 
+	child_device_obj->device.dma_mask = &vmbus_dma_mask;
 	return 0;
 
 err_kset_unregister:
diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
index e285a220c913..2604619c6fa3 100644
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -13,14 +13,22 @@ 
 #include <linux/irq.h>
 #include <linux/iommu.h>
 #include <linux/module.h>
+#include <linux/hyperv.h>
+#include <linux/io.h>
 
 #include <asm/apic.h>
 #include <asm/cpu.h>
 #include <asm/hw_irq.h>
 #include <asm/io_apic.h>
+#include <asm/iommu.h>
+#include <asm/iommu_table.h>
 #include <asm/irq_remapping.h>
 #include <asm/hypervisor.h>
 #include <asm/mshyperv.h>
+#include <asm/swiotlb.h>
+#include <linux/dma-map-ops.h>
+#include <linux/dma-direct.h>
+#include <linux/set_memory.h>
 
 #include "irq_remapping.h"
 
@@ -36,6 +44,8 @@ 
 static cpumask_t ioapic_max_cpumask = { CPU_BITS_NONE };
 static struct irq_domain *ioapic_ir_domain;
 
+static unsigned long hyperv_io_tlb_start, hyperv_io_tlb_size;
+
 static int hyperv_ir_set_affinity(struct irq_data *data,
 		const struct cpumask *mask, bool force)
 {
@@ -337,4 +347,75 @@  static const struct irq_domain_ops hyperv_root_ir_domain_ops = {
 	.free = hyperv_root_irq_remapping_free,
 };
 
+void __init hyperv_iommu_swiotlb_init(void)
+{
+	unsigned long bytes, io_tlb_nslabs;
+	void *vstart;
+
+	/* Allocate Hyper-V swiotlb  */
+	bytes = 200 * 1024 * 1024;
+	vstart = memblock_alloc_low(PAGE_ALIGN(bytes), PAGE_SIZE);
+	io_tlb_nslabs = bytes >> IO_TLB_SHIFT;
+	hyperv_io_tlb_size = bytes;
+
+	if (!vstart) {
+		pr_warn("Fail to allocate swiotlb.\n");
+		return;
+	}
+
+	hyperv_io_tlb_start = virt_to_phys(vstart);
+	if (!hyperv_io_tlb_start)
+		panic("%s: Failed to allocate %lu bytes align=0x%lx.\n",
+		      __func__, PAGE_ALIGN(bytes), PAGE_SIZE);
+
+	if (swiotlb_init_with_tbl(vstart, io_tlb_nslabs, 1))
+		panic("%s: Cannot allocate SWIOTLB buffer.\n", __func__);
+
+	swiotlb_set_max_segment(HV_HYP_PAGE_SIZE);
+}
+
+int __init hyperv_swiotlb_detect(void)
+{
+	if (hypervisor_is_type(X86_HYPER_MS_HYPERV)
+	    && hv_is_isolation_supported()) {
+		/*
+		 * Enable swiotlb force mode in Isolation VM to
+		 * use swiotlb bounce buffer for dma transaction.
+		 */
+		swiotlb_force = SWIOTLB_FORCE;
+		return 1;
+	}
+
+	return 0;
+}
+
+void __init hyperv_iommu_swiotlb_later_init(void)
+{
+	void *hyperv_io_tlb_remap;
+	int ret;
+
+	/* Mask bounce buffer visible to host and remap extra address. */
+	if (hv_isolation_type_snp()) {
+		ret = set_memory_decrypted((unsigned long)
+				phys_to_virt(hyperv_io_tlb_start),
+				HVPFN_UP(hyperv_io_tlb_size));
+		if (ret)
+			panic("%s: Fail to mark Hyper-v swiotlb buffer visible to host. err=%d\n",
+			      __func__, ret);
+
+		hyperv_io_tlb_remap = ioremap_cache(hyperv_io_tlb_start
+				+ ms_hyperv.shared_gpa_boundary,
+				hyperv_io_tlb_size);
+		if (!hyperv_io_tlb_remap)
+			panic("Fail to remap io tlb.\n");
+
+		memset(hyperv_io_tlb_remap, 0x00, hyperv_io_tlb_size);
+		swiotlb_set_bounce_remap(hyperv_io_tlb_remap);
+	}
+}
+
+IOMMU_INIT_FINISH(hyperv_swiotlb_detect,
+		  NULL, hyperv_iommu_swiotlb_init,
+		  hyperv_iommu_swiotlb_later_init);
+
 #endif
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 06eccaba10c5..babbe19f57e2 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1759,6 +1759,7 @@  int hyperv_write_cfg_blk(struct pci_dev *dev, void *buf, unsigned int len,
 int hyperv_reg_block_invalidate(struct pci_dev *dev, void *context,
 				void (*block_invalidate)(void *context,
 							 u64 block_mask));
+int __init hyperv_swiotlb_detect(void);
 
 struct hyperv_pci_block_ops {
 	int (*read_block)(struct pci_dev *dev, void *buf, unsigned int buf_len,