Message ID: 20211116153923.196763-4-ltykernel@gmail.com (mailing list archive)
State: Superseded
Series: x86/Hyper-V: Add Hyper-V Isolation VM support (Second part)
On Tue, Nov 16, 2021 at 10:39:21AM -0500, Tianyu Lan wrote:
> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
> index 35487305d8af..65bc385ae07a 100644
> --- a/arch/x86/mm/mem_encrypt.c
> +++ b/arch/x86/mm/mem_encrypt.c
> @@ -31,6 +31,7 @@
>  #include <asm/processor-flags.h>
>  #include <asm/msr.h>
>  #include <asm/cmdline.h>
> +#include <asm/mshyperv.h>
>
>  #include "mm_internal.h"
>
> @@ -203,7 +204,8 @@ void __init sev_setup_arch(void)
>  	phys_addr_t total_mem = memblock_phys_mem_size();
>  	unsigned long size;
>
> -	if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
> +	if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)
> +	    && !hv_is_isolation_supported())

Are we gonna start sprinkling this hv_is_isolation_supported() check
everywhere now?

Are those isolation VMs SEV-like guests? Is CC_ATTR_GUEST_MEM_ENCRYPT
set on them?

What you should do, instead, is add an isolation-VM-specific
hv_cc_platform_has() just like amd_cc_platform_has() and handle the
cc_attrs there for your platform, e.g. return false for
CC_ATTR_GUEST_MEM_ENCRYPT, and then you won't need to add that hv_*
check everywhere.

And then fix it up in __set_memory_enc_dec() too.
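For illustration, a minimal sketch of the suggested hook, modeled on the
existing amd_cc_platform_has() dispatch in arch/x86/kernel/cc_platform.c
(the function name and the per-attribute answers here are assumptions
for the sketch, not part of the patch):

#include <linux/cc_platform.h>
#include <asm/mshyperv.h>

static bool hv_cc_platform_has(enum cc_attr attr)
{
	/*
	 * Example only: which attributes an isolation VM reports is a
	 * platform decision. (The variant later merged upstream returned
	 * true for CC_ATTR_GUEST_MEM_ENCRYPT.)
	 */
	return attr == CC_ATTR_GUEST_MEM_ENCRYPT;
}

bool cc_platform_has(enum cc_attr attr)
{
	if (sme_me_mask)
		return amd_cc_platform_has(attr);

	if (hv_is_isolation_supported())
		return hv_cc_platform_has(attr);

	return false;
}

With this in place, call sites such as sev_setup_arch() keep a single
cc_platform_has() check and need no hv_is_isolation_supported() special
casing.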
This doesn't really have much to do with normal DMA mapping,
so why does this go through the dma ops?
On 11/17/2021 3:12 AM, Borislav Petkov wrote:
> What you should do, instead, is add an isolation-VM-specific
> hv_cc_platform_has() just like amd_cc_platform_has() and handle the
> cc_attrs there for your platform, e.g. return false for
> CC_ATTR_GUEST_MEM_ENCRYPT, and then you won't need to add that hv_*
> check everywhere.
>
> And then fix it up in __set_memory_enc_dec() too.

Yes, agreed. Will add the Hyper-V cc_attrs and check them via
cc_platform_has().
On 11/17/2021 6:01 PM, Christoph Hellwig wrote:
> This doesn't really have much to do with normal DMA mapping,
> so why does this go through the dma ops?

According to the previous discussion, dma_alloc_noncontiguous() and
dma_vmap_noncontiguous() may be used to handle the noncontiguous memory
allocation and mapping in the netvsc driver, so this patch adds
alloc/free and vmap/vunmap callbacks to handle that case. The previous
v4 and v5 patches handled the allocation and mapping in the netvsc
driver instead. If this should not go through the dma ops, we could
also make these vmbus-specific functions and keep them in the vmbus
driver.

https://lkml.org/lkml/2021/9/28/51
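For reference, this is roughly the driver-side pattern being discussed,
using the generic API from <linux/dma-mapping.h> (a sketch only; the
helper names and error handling are illustrative, not the actual netvsc
code):

#include <linux/dma-mapping.h>

/* Hypothetical helper showing the intended call sequence. */
static void *alloc_recv_buf(struct device *dev, size_t size,
			    struct sg_table **sgt_out)
{
	struct sg_table *sgt;
	void *vaddr;

	/* Allocate possibly noncontiguous pages suitable for DMA. */
	sgt = dma_alloc_noncontiguous(dev, size, DMA_FROM_DEVICE,
				      GFP_KERNEL, 0);
	if (!sgt)
		return NULL;

	/* Map them into one contiguous kernel virtual range. */
	vaddr = dma_vmap_noncontiguous(dev, size, sgt);
	if (!vaddr) {
		dma_free_noncontiguous(dev, size, sgt, DMA_FROM_DEVICE);
		return NULL;
	}

	*sgt_out = sgt;
	return vaddr;
}

static void free_recv_buf(struct device *dev, size_t size,
			  struct sg_table *sgt, void *vaddr)
{
	dma_vunmap_noncontiguous(dev, vaddr);
	dma_free_noncontiguous(dev, size, sgt, DMA_FROM_DEVICE);
}

On Hyper-V isolation VMs the vmap step is where the pages must land
above the shared GPA boundary, which is what the patch's vmap callback
implements.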
On 11/17/2021 10:00 PM, Tianyu Lan wrote:
> On 11/17/2021 6:01 PM, Christoph Hellwig wrote:
>> This doesn't really have much to do with normal DMA mapping,
>> so why does this go through the dma ops?
>
> According to the previous discussion, dma_alloc_noncontiguous() and
> dma_vmap_noncontiguous() may be used to handle the noncontiguous
> memory allocation and mapping in the netvsc driver, so this patch adds
> alloc/free and vmap/vunmap callbacks to handle that case. The previous
> v4 and v5 patches handled the allocation and mapping in the netvsc
> driver instead. If this should not go through the dma ops, we could
> also make these vmbus-specific functions and keep them in the vmbus
> driver.
>
> https://lkml.org/lkml/2021/9/28/51

Hi Christoph:
	Sorry to bother you. Could you take a look and say which
solution you prefer? If we should call dma_alloc/map_noncontiguous()
in the netvsc driver, as patch 4 does, then the Hyper-V specific
implementation needs to be hidden behind callbacks that
dma_alloc/map_noncontiguous() invokes, which is why I used dma ops
here. If the allocation and mapping should instead be Hyper-V specific
functions, we can put them in the vmbus driver, and other vmbus device
drivers may also reuse them if necessary.

Thanks.
On Wed, Nov 17, 2021 at 10:00:08PM +0800, Tianyu Lan wrote:
> On 11/17/2021 6:01 PM, Christoph Hellwig wrote:
>> This doesn't really have much to do with normal DMA mapping,
>> so why does this go through the dma ops?
>
> According to the previous discussion, dma_alloc_noncontiguous() and
> dma_vmap_noncontiguous() may be used to handle the noncontiguous
> memory allocation and mapping in the netvsc driver, so this patch adds
> alloc/free and vmap/vunmap callbacks to handle that case. The previous
> v4 and v5 patches handled the allocation and mapping in the netvsc
> driver instead. If this should not go through the dma ops, we could
> also make these vmbus-specific functions and keep them in the vmbus
> driver.

But that only makes sense if they can actually use the normal DMA ops.
If you implement your own incomplete ops and require their use, you do
nothing but add indirect calls to your fast path and make the code
convoluted.
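The fast-path concern here is about the DMA core's dispatch. Simplified
and paraphrased from kernel/dma/mapping.c (not verbatim), the streaming
map path looks roughly like this; DMA_OPS_BYPASS exists precisely so
that path can stay direct while custom ops only see the rarely used
hooks:

/* Simplified paraphrase of kernel/dma/mapping.c. */
dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
			      size_t offset, size_t size,
			      enum dma_data_direction dir,
			      unsigned long attrs)
{
	const struct dma_map_ops *ops = get_dma_ops(dev);

	/*
	 * With no ops installed, or with dev->dma_ops_bypass set and a
	 * capable dma mask, the core goes straight to dma-direct (and
	 * through swiotlb when forced); only otherwise does it take the
	 * indirect call into ops->map_page().
	 */
	if (dma_map_direct(dev, ops))
		return dma_direct_map_page(dev, page, offset, size,
					   dir, attrs);

	return ops->map_page(dev, page, offset, size, dir, attrs);
}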
On 11/26/2021 3:40 PM, Christoph Hellwig wrote:
> On Wed, Nov 17, 2021 at 10:00:08PM +0800, Tianyu Lan wrote:
>> On 11/17/2021 6:01 PM, Christoph Hellwig wrote:
>>> This doesn't really have much to do with normal DMA mapping,
>>> so why does this go through the dma ops?
>>
>> According to the previous discussion, dma_alloc_noncontiguous() and
>> dma_vmap_noncontiguous() may be used to handle the noncontiguous
>> memory allocation and mapping in the netvsc driver, so this patch
>> adds alloc/free and vmap/vunmap callbacks to handle that case. The
>> previous v4 and v5 patches handled the allocation and mapping in the
>> netvsc driver instead. If this should not go through the dma ops, we
>> could also make these vmbus-specific functions and keep them in the
>> vmbus driver.
>
> But that only makes sense if they can actually use the normal DMA ops.
> If you implement your own incomplete ops and require their use, you do
> nothing but add indirect calls to your fast path and make the code
> convoluted.

The generic implementation can't meet the netvsc driver's requirement
to allocate 16MB of memory and map the pages via vmap_pfn(), so this
patch adds the Hyper-V alloc_noncontiguous and vmap_noncontiguous
callbacks. If this is not the right approach, should we call these
Hyper-V functions directly in the netvsc driver? Could you take a look
at Michael's summary of this series and give some guidance?

https://www.mail-archive.com/xen-devel@lists.xenproject.org/msg109284.html

Thanks.
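If the conclusion is to drop the dma ops and call Hyper-V-specific
helpers from netvsc directly, the remap half might look something like
the following (hypothetical sketch; the name hv_map_memory and its
placement are assumptions, not settled by this thread):

#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <asm/mshyperv.h>

/*
 * Hypothetical helper: remap a vmalloc'ed buffer above the shared GPA
 * boundary so the host can access it in an SNP isolation VM.
 */
void *hv_map_memory(void *addr, unsigned long size)
{
	unsigned long *pfns;
	void *vaddr;
	int i;

	pfns = kcalloc(size / PAGE_SIZE, sizeof(*pfns), GFP_KERNEL);
	if (!pfns)
		return NULL;

	/* Offset each PFN by the shared GPA boundary before remapping. */
	for (i = 0; i < size / PAGE_SIZE; i++)
		pfns[i] = vmalloc_to_pfn(addr + i * PAGE_SIZE) +
			  (ms_hyperv.shared_gpa_boundary >> PAGE_SHIFT);

	vaddr = vmap_pfn(pfns, size / PAGE_SIZE, PAGE_KERNEL_IO);
	kfree(pfns);
	return vaddr;
}

The driver would then vmalloc() its 16MB buffer itself and call the
helper, with no dma ops indirection on the fast path.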
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 35487305d8af..65bc385ae07a 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -31,6 +31,7 @@
 #include <asm/processor-flags.h>
 #include <asm/msr.h>
 #include <asm/cmdline.h>
+#include <asm/mshyperv.h>
 
 #include "mm_internal.h"
 
@@ -203,7 +204,8 @@ void __init sev_setup_arch(void)
 	phys_addr_t total_mem = memblock_phys_mem_size();
 	unsigned long size;
 
-	if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+	if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)
+	    && !hv_is_isolation_supported())
 		return;
 
 	/*
diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index 46df59aeaa06..30fd0600b008 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -4,6 +4,7 @@
 
 #include <linux/dma-map-ops.h>
 #include <linux/pci.h>
+#include <linux/hyperv.h>
 #include <xen/swiotlb-xen.h>
 
 #include <asm/xen/hypervisor.h>
@@ -91,6 +92,6 @@ int pci_xen_swiotlb_init_late(void)
 EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late);
 
 IOMMU_INIT_FINISH(pci_xen_swiotlb_detect,
-		  NULL,
+		  hyperv_swiotlb_detect,
 		  pci_xen_swiotlb_init,
 		  NULL);
diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index dd12af20e467..d43b4cd88f57 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -9,6 +9,7 @@ config HYPERV
 	select PARAVIRT
 	select X86_HV_CALLBACK_VECTOR if X86
 	select VMAP_PFN
+	select DMA_OPS_BYPASS
 	help
 	  Select this option to run Linux as a Hyper-V client operating
 	  system.
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 392c1ac4f819..32dc193e31cd 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -33,6 +33,7 @@
 #include <linux/random.h>
 #include <linux/kernel.h>
 #include <linux/syscore_ops.h>
+#include <linux/dma-map-ops.h>
 #include <clocksource/hyperv_timer.h>
 #include "hyperv_vmbus.h"
 
@@ -2078,6 +2079,7 @@ struct hv_device *vmbus_device_create(const guid_t *type,
 	return child_device_obj;
 }
 
+static u64 vmbus_dma_mask = DMA_BIT_MASK(64);
 /*
  * vmbus_device_register - Register the child device
  */
@@ -2118,6 +2120,10 @@ int vmbus_device_register(struct hv_device *child_device_obj)
 	}
 	hv_debug_add_dev_dir(child_device_obj);
 
+	child_device_obj->device.dma_ops_bypass = true;
+	child_device_obj->device.dma_ops = &hyperv_iommu_dma_ops;
+	child_device_obj->device.dma_mask = &vmbus_dma_mask;
+	child_device_obj->device.dma_parms = &child_device_obj->dma_parms;
 	return 0;
 
 err_kset_unregister:
diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
index e285a220c913..ebcb628e7e8f 100644
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -13,14 +13,21 @@
 #include <linux/irq.h>
 #include <linux/iommu.h>
 #include <linux/module.h>
+#include <linux/hyperv.h>
+#include <linux/io.h>
 
 #include <asm/apic.h>
 #include <asm/cpu.h>
 #include <asm/hw_irq.h>
 #include <asm/io_apic.h>
+#include <asm/iommu.h>
+#include <asm/iommu_table.h>
 #include <asm/irq_remapping.h>
 #include <asm/hypervisor.h>
 #include <asm/mshyperv.h>
+#include <asm/swiotlb.h>
+#include <linux/dma-map-ops.h>
+#include <linux/dma-direct.h>
 
 #include "irq_remapping.h"
 
@@ -337,4 +344,161 @@ static const struct irq_domain_ops hyperv_root_ir_domain_ops = {
 	.free = hyperv_root_irq_remapping_free,
 };
 
+static void __init hyperv_iommu_swiotlb_init(void)
+{
+	unsigned long hyperv_io_tlb_size;
+	void *hyperv_io_tlb_start;
+
+	/*
+	 * Allocate Hyper-V swiotlb bounce buffer at early place
+	 * to reserve large contiguous memory.
+	 */
+	hyperv_io_tlb_size = swiotlb_size_or_default();
+	hyperv_io_tlb_start = memblock_alloc(hyperv_io_tlb_size, PAGE_SIZE);
+
+	if (!hyperv_io_tlb_start)
+		pr_warn("Fail to allocate Hyper-V swiotlb buffer.\n");
+
+	swiotlb_init_with_tbl(hyperv_io_tlb_start,
+			      hyperv_io_tlb_size >> IO_TLB_SHIFT, true);
+}
+
+int __init hyperv_swiotlb_detect(void)
+{
+	if (!hypervisor_is_type(X86_HYPER_MS_HYPERV))
+		return 0;
+
+	if (!hv_is_isolation_supported())
+		return 0;
+
+	/*
+	 * Enable swiotlb force mode in Isolation VM to
+	 * use swiotlb bounce buffer for dma transaction.
+	 */
+	if (hv_isolation_type_snp())
+		swiotlb_unencrypted_base = ms_hyperv.shared_gpa_boundary;
+	swiotlb_force = SWIOTLB_FORCE;
+	return 1;
+}
+
+static void __init hyperv_iommu_swiotlb_later_init(void)
+{
+	/*
+	 * Swiotlb bounce buffer needs to be mapped in extra address
+	 * space. Map function doesn't work in the early place and so
+	 * call swiotlb_update_mem_attributes() here.
+	 */
+	swiotlb_update_mem_attributes();
+}
+
+IOMMU_INIT_FINISH(hyperv_swiotlb_detect,
+		  NULL, hyperv_iommu_swiotlb_init,
+		  hyperv_iommu_swiotlb_later_init);
+
+static struct sg_table *hyperv_dma_alloc_noncontiguous(struct device *dev,
+		size_t size, enum dma_data_direction dir, gfp_t gfp,
+		unsigned long attrs)
+{
+	struct dma_sgt_handle *sh;
+	struct page **pages;
+	int num_pages = size >> PAGE_SHIFT;
+	void *vaddr, *ptr;
+	int rc, i;
+
+	if (!hv_isolation_type_snp())
+		return NULL;
+
+	sh = kmalloc(sizeof(*sh), gfp);
+	if (!sh)
+		return NULL;
+
+	vaddr = vmalloc(size);
+	if (!vaddr)
+		goto free_sgt;
+
+	pages = kvmalloc_array(num_pages, sizeof(struct page *),
+			       GFP_KERNEL | __GFP_ZERO);
+	if (!pages)
+		goto free_mem;
+
+	for (i = 0, ptr = vaddr; i < num_pages; ++i, ptr += PAGE_SIZE)
+		pages[i] = vmalloc_to_page(ptr);
+
+	rc = sg_alloc_table_from_pages(&sh->sgt, pages, num_pages, 0, size, GFP_KERNEL);
+	if (rc)
+		goto free_pages;
+
+	sh->sgt.sgl->dma_address = (dma_addr_t)vaddr;
+	sh->sgt.sgl->dma_length = size;
+	sh->pages = pages;
+
+	return &sh->sgt;
+
+free_pages:
+	kvfree(pages);
+free_mem:
+	vfree(vaddr);
+free_sgt:
+	kfree(sh);
+	return NULL;
+}
+
+static void hyperv_dma_free_noncontiguous(struct device *dev, size_t size,
+		struct sg_table *sgt, enum dma_data_direction dir)
+{
+	struct dma_sgt_handle *sh = sgt_handle(sgt);
+
+	if (!hv_isolation_type_snp())
+		return;
+
+	vfree((void *)sh->sgt.sgl->dma_address);
+	sg_free_table(&sh->sgt);
+	kvfree(sh->pages);
+	kfree(sh);
+}
+
+static void *hyperv_dma_vmap_noncontiguous(struct device *dev, size_t size,
+		struct sg_table *sgt)
+{
+	int pg_count = size >> PAGE_SHIFT;
+	unsigned long *pfns;
+	struct page **pages = sgt_handle(sgt)->pages;
+	void *vaddr = NULL;
+	int i;
+
+	if (!hv_isolation_type_snp())
+		return NULL;
+
+	if (!pages)
+		return NULL;
+
+	pfns = kcalloc(pg_count, sizeof(*pfns), GFP_KERNEL);
+	if (!pfns)
+		return NULL;
+
+	for (i = 0; i < pg_count; i++)
+		pfns[i] = page_to_pfn(pages[i]) +
+			(ms_hyperv.shared_gpa_boundary >> PAGE_SHIFT);
+
+	vaddr = vmap_pfn(pfns, pg_count, PAGE_KERNEL);
+	kfree(pfns);
+	return vaddr;
+
+}
+
+static void hyperv_dma_vunmap_noncontiguous(struct device *dev, void *addr)
+{
+	if (!hv_isolation_type_snp())
+		return;
+	vunmap(addr);
+}
+
+const struct dma_map_ops hyperv_iommu_dma_ops = {
+	.alloc_noncontiguous = hyperv_dma_alloc_noncontiguous,
+	.free_noncontiguous = hyperv_dma_free_noncontiguous,
+	.vmap_noncontiguous = hyperv_dma_vmap_noncontiguous,
+	.vunmap_noncontiguous = hyperv_dma_vunmap_noncontiguous,
+};
+EXPORT_SYMBOL_GPL(hyperv_iommu_dma_ops);
+
 #endif
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index b823311eac79..4d44fb3b3f1c 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1726,6 +1726,16 @@ int hyperv_write_cfg_blk(struct pci_dev *dev, void *buf, unsigned int len,
 int hyperv_reg_block_invalidate(struct pci_dev *dev, void *context,
 				void (*block_invalidate)(void *context,
 							 u64 block_mask));
+#ifdef CONFIG_HYPERV
+int __init hyperv_swiotlb_detect(void);
+#else
+static inline int __init hyperv_swiotlb_detect(void)
+{
+	return 0;
+}
+#endif
+
+extern const struct dma_map_ops hyperv_iommu_dma_ops;
 
 struct hyperv_pci_block_ops {
 	int (*read_block)(struct pci_dev *dev, void *buf, unsigned int buf_len,
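For context on the swiotlb_unencrypted_base assignment above: on the
swiotlb side, the hook (roughly as it later landed in
kernel/dma/swiotlb.c; simplified paraphrase, not part of this patch)
remaps the bounce buffer at the offset physical address:

/* Simplified from kernel/dma/swiotlb.c as merged later. */
static void *swiotlb_mem_remap(struct io_tlb_mem *mem, unsigned long bytes)
{
	void *vaddr = NULL;

	if (swiotlb_unencrypted_base) {
		/*
		 * Shift the buffer above the shared GPA boundary so the
		 * host can access it without the guest decrypting pages.
		 */
		phys_addr_t paddr = mem->start + swiotlb_unencrypted_base;

		vaddr = memremap(paddr, bytes, MEMREMAP_WB);
		if (!vaddr)
			pr_err("Failed to map the unencrypted memory %pa size %lx.\n",
			       &paddr, bytes);
	}

	return vaddr;
}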