Message ID | 004feaef-f3bb-e4bb-fb10-f205a9f69f28@suse.com (mailing list archive) |
---|---|
State | New, archived |
Series | swiotlb-xen: fixes and adjustments |
On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
> The code is unreachable for HVM or PVH, and it also makes little sense
> in auto-translated environments. On Arm, with
> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
> time seeing what good the Xen specific variant does - the generic one
> ought to be fine for all purposes there. Still Arm code explicitly
> references symbols here, so the code will continue to be included there.

Can the Xen/arm folks look into that? Getting ARM out of using
swiotlb-xen would be a huge step forward cleaning up some DMA APIs.

> Instead of making PCI_XEN's "select" conditional, simply drop it -
> SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
> is - as explained above - dead code in non-PV environments.
>
> This in turn allows dropping the stubs for
> xen_{create,destroy}_contiguous_region(), the former of which was broken
> anyway - it failed to set the DMA handle output.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

On Wed, 8 Sep 2021, Christoph Hellwig wrote:
> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
> > The code is unreachable for HVM or PVH, and it also makes little sense
> > in auto-translated environments. On Arm, with
> > xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
> > time seeing what good the Xen specific variant does - the generic one
> > ought to be fine for all purposes there. Still Arm code explicitly
> > references symbols here, so the code will continue to be included there.
>
> Can the Xen/arm folks look into that? Getting ARM out of using
> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.

On ARM swiotlb-xen is used for a different purpose compared to x86.

Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
physical == physical address).

Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
swiotlb-xen. But when we start using PV drivers to share the network or
disk between Dom0 and DomU we are going to get DomU pages mapped in
Dom0, we call them "foreign pages". They are not mapped 1:1. It can
happen that one of these foreign pages is used for DMA operations
(e.g. related to the NIC). swiotlb-xen is used to detect these
situations and translate the guest physical address to physical address
of foreign pages appropriately.

If an IOMMU is available and the DMA-mastering device is behind it, then
swiotlb-xen is not necessary. FYI there is community interest in
selectively disabling swiotlb-xen for devices that are behind an IOMMU.

> > Instead of making PCI_XEN's "select" conditional, simply drop it -
> > SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
> > is - as explained above - dead code in non-PV environments.
> >
> > This in turn allows dropping the stubs for
> > xen_{create,destroy}_contiguous_region(), the former of which was broken
> > anyway - it failed to set the DMA handle output.
>
> Looks good:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

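The address translation described in this reply can be pictured with a small userspace sketch. It is not the kernel's p2m code: the table, its values, and the name `sketch_pfn_to_mfn` are made up for illustration, and the only behaviour it assumes is the one stated above (Dom0's own frames map 1:1, foreign frames do not).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one foreign (DomU) page mapped into Dom0. */
struct foreign_map {
    uint64_t pfn; /* guest frame number at which Dom0 sees the page */
    uint64_t mfn; /* machine frame number that actually backs it */
};

/* Illustrative table only; the frame numbers are invented. */
static const struct foreign_map foreign_pages[] = {
    { 0x80000, 0x12345 },
    { 0x80001, 0x0abcd },
};

/*
 * Return the machine frame behind a guest frame.
 * Assumption taken from the thread: Dom0 memory is mapped 1:1, so any
 * frame that is not a known foreign page translates to itself.
 */
static uint64_t sketch_pfn_to_mfn(uint64_t pfn)
{
    size_t n = sizeof(foreign_pages) / sizeof(foreign_pages[0]);

    for (size_t i = 0; i < n; i++) {
        if (foreign_pages[i].pfn == pfn)
            return foreign_pages[i].mfn; /* foreign page: not 1:1 */
    }
    return pfn; /* Dom0's own memory: guest physical == physical */
}
```

With this table, `sketch_pfn_to_mfn(0x1000)` returns `0x1000`, while `sketch_pfn_to_mfn(0x80000)` returns `0x12345`; a device without an IOMMU in front of it has to be programmed with the latter kind of address.
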
On 11.09.2021 01:48, Stefano Stabellini wrote:
> On Wed, 8 Sep 2021, Christoph Hellwig wrote:
>> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
>>> The code is unreachable for HVM or PVH, and it also makes little sense
>>> in auto-translated environments. On Arm, with
>>> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
>>> time seeing what good the Xen specific variant does - the generic one
>>> ought to be fine for all purposes there. Still Arm code explicitly
>>> references symbols here, so the code will continue to be included there.
>>
>> Can the Xen/arm folks look into that? Getting ARM out of using
>> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.
>
> On ARM swiotlb-xen is used for a different purpose compared to x86.
>
> Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
> devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
> physical == physical address).
>
> Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
> swiotlb-xen. But when we start using PV drivers to share the network or
> disk between Dom0 and DomU we are going to get DomU pages mapped in
> Dom0, we call them "foreign pages". They are not mapped 1:1. It can
> happen that one of these foreign pages is used for DMA operations
> (e.g. related to the NIC). swiotlb-xen is used to detect these
> situations and translate the guest physical address to physical address
> of foreign pages appropriately.

Hmm, you say "translate", which isn't my understanding of swiotlb's
purpose. As per my understanding swiotlb instead double buffers data
such that it becomes accessible, or suitably arranges underlying
machine addresses. The latter part is clearly a PV-only thing, unused
by Arm as can be seen by there not being any use of XENMEM_exchange.
So it must be the former part that you're talking about, but that's
also the purpose of the non-Xen swiotlb code. If only for my own
education and understanding, could you point me at the difference
between swiotlb-xen and generic swiotlb which addresses this specific
aspect of Arm behavior?

Jan

On 11.09.2021 01:48, Stefano Stabellini wrote:
> On Wed, 8 Sep 2021, Christoph Hellwig wrote:
>> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
>>> The code is unreachable for HVM or PVH, and it also makes little sense
>>> in auto-translated environments. On Arm, with
>>> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
>>> time seeing what good the Xen specific variant does - the generic one
>>> ought to be fine for all purposes there. Still Arm code explicitly
>>> references symbols here, so the code will continue to be included there.
>>
>> Can the Xen/arm folks look into that? Getting ARM out of using
>> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.
>
> On ARM swiotlb-xen is used for a different purpose compared to x86.
>
> Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
> devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
> physical == physical address).
>
> Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
> swiotlb-xen. But when we start using PV drivers to share the network or
> disk between Dom0 and DomU we are going to get DomU pages mapped in
> Dom0, we call them "foreign pages". They are not mapped 1:1. It can
> happen that one of these foreign pages is used for DMA operations
> (e.g. related to the NIC). swiotlb-xen is used to detect these
> situations and translate the guest physical address to physical address
> of foreign pages appropriately.

Thinking about this some more - if Dom0 is 1:1 mapped, why don't you
map foreign pages 1:1 as well then?

>>> Instead of making PCI_XEN's "select" conditional, simply drop it -
>>> SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
>>> is - as explained above - dead code in non-PV environments.
>>>
>>> This in turn allows dropping the stubs for
>>> xen_{create,destroy}_contiguous_region(), the former of which was broken
>>> anyway - it failed to set the DMA handle output.
>>
>> Looks good:
>>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thanks for this and the other reviews.

Jan

On Mon, 13 Sep 2021, Jan Beulich wrote:
> On 11.09.2021 01:48, Stefano Stabellini wrote:
> > On Wed, 8 Sep 2021, Christoph Hellwig wrote:
> >> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
> >>> The code is unreachable for HVM or PVH, and it also makes little sense
> >>> in auto-translated environments. On Arm, with
> >>> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
> >>> time seeing what good the Xen specific variant does - the generic one
> >>> ought to be fine for all purposes there. Still Arm code explicitly
> >>> references symbols here, so the code will continue to be included there.
> >>
> >> Can the Xen/arm folks look into that? Getting ARM out of using
> >> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.
> >
> > On ARM swiotlb-xen is used for a different purpose compared to x86.
> >
> > Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
> > devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
> > physical == physical address).
> >
> > Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
> > swiotlb-xen. But when we start using PV drivers to share the network or
> > disk between Dom0 and DomU we are going to get DomU pages mapped in
> > Dom0, we call them "foreign pages". They are not mapped 1:1. It can
> > happen that one of these foreign pages is used for DMA operations
> > (e.g. related to the NIC). swiotlb-xen is used to detect these
> > situations and translate the guest physical address to physical address
> > of foreign pages appropriately.
>
> Thinking about this some more - if Dom0 is 1:1 mapped, why don't you
> map foreign pages 1:1 as well then?

That's because the foreign page, from Linux's POV, would appear out of
thin air. It would just show up in a region not considered memory just
a few moments before, so there would be no memblock, no struct page,
nothing. At least in the past that caused serious issues for the kernel.

This is the reason why the kernel is using ballooned-out pages to map
foreign pages even on x86:

drivers/block/xen-blkback/blkback.c:xen_blkbk_map -> gnttab_page_cache_get
drivers/xen/grant-table.c:gnttab_page_cache_get
drivers/xen/grant-table.c:gnttab_alloc_pages
drivers/xen/unpopulated-alloc.c:xen_alloc_unpopulated_pages
drivers/xen/balloon.c:alloc_xenballooned_pages

On Mon, 13 Sep 2021, Jan Beulich wrote:
> On 11.09.2021 01:48, Stefano Stabellini wrote:
> > On Wed, 8 Sep 2021, Christoph Hellwig wrote:
> >> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
> >>> The code is unreachable for HVM or PVH, and it also makes little sense
> >>> in auto-translated environments. On Arm, with
> >>> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
> >>> time seeing what good the Xen specific variant does - the generic one
> >>> ought to be fine for all purposes there. Still Arm code explicitly
> >>> references symbols here, so the code will continue to be included there.
> >>
> >> Can the Xen/arm folks look into that? Getting ARM out of using
> >> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.
> >
> > On ARM swiotlb-xen is used for a different purpose compared to x86.
> >
> > Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
> > devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
> > physical == physical address).
> >
> > Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
> > swiotlb-xen. But when we start using PV drivers to share the network or
> > disk between Dom0 and DomU we are going to get DomU pages mapped in
> > Dom0, we call them "foreign pages". They are not mapped 1:1. It can
> > happen that one of these foreign pages is used for DMA operations
> > (e.g. related to the NIC). swiotlb-xen is used to detect these
> > situations and translate the guest physical address to physical address
> > of foreign pages appropriately.
>
> Hmm, you say "translate", which isn't my understanding of swiotlb's
> purpose. As per my understanding swiotlb instead double buffers data
> such that it becomes accessible, or suitably arranges underlying
> machine addresses. The latter part is clearly a PV-only thing, unused
> by Arm as can be seen by there not being any use of XENMEM_exchange.
> So it must be the former part that you're talking about, but that's
> also the purpose of the non-Xen swiotlb code. If only for my own
> education and understanding, could you point me at the difference
> between swiotlb-xen and generic swiotlb which addresses this specific
> aspect of Arm behavior?

If you look at xen_swiotlb_map_page, you'll see the call to
xen_phys_to_dma, which eventually calls
arch/arm/xen/p2m.c:__pfn_to_mfn. If everything goes well and we only
need to do translation, we'll "goto done". Otherwise, we'll fall back on
a swiotlb buffer with swiotlb_tbl_map_single, the result of which also
needs to be translated; see the second call to xen_phys_to_dma.

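The control flow described in this reply (translate first, bounce only if the translated address is unusable, then translate the bounce buffer as well) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel function: every `sk_`/`SK_` name and every address is hypothetical, the data copy into the bounce buffer is not modelled, and the real xen_swiotlb_map_page applies further conditions that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_PAGE_SHIFT  12
#define SK_PAGE_MASK   ((UINT64_C(1) << SK_PAGE_SHIFT) - 1)

/* Hypothetical: one foreign guest frame and the machine frame behind it. */
#define SK_FOREIGN_PFN UINT64_C(0x80000)
#define SK_FOREIGN_MFN UINT64_C(0x212345) /* lies above the 4 GiB boundary */

/* Hypothetical bounce buffer located in Dom0's own, 1:1 mapped memory. */
#define SK_BOUNCE_PHYS UINT64_C(0x04000000)

struct sk_mapping {
    uint64_t dma;     /* address that would be handed to the device */
    bool     bounced; /* true if the bounce buffer had to be used */
};

/* Stand-in for the translation step: 1:1 unless the frame is foreign. */
static uint64_t sk_phys_to_dma(uint64_t phys)
{
    uint64_t pfn = phys >> SK_PAGE_SHIFT;
    uint64_t mfn = (pfn == SK_FOREIGN_PFN) ? SK_FOREIGN_MFN : pfn;

    return (mfn << SK_PAGE_SHIFT) | (phys & SK_PAGE_MASK);
}

/* Simplified reachability check: the whole range must fit the DMA mask. */
static bool sk_dma_capable(uint64_t dma, uint64_t size, uint64_t dma_mask)
{
    return dma + size - 1 <= dma_mask;
}

static struct sk_mapping sk_map_page(uint64_t phys, uint64_t size,
                                     uint64_t dma_mask)
{
    struct sk_mapping m = { sk_phys_to_dma(phys), false };

    /* Translation alone is enough: the "goto done" case. */
    if (sk_dma_capable(m.dma, size, dma_mask))
        return m;

    /* Otherwise fall back to the bounce buffer and translate it, too. */
    m.dma = sk_phys_to_dma(SK_BOUNCE_PHYS);
    m.bounced = true;
    return m;
}
```

In this model a buffer inside the foreign frame (guest physical `0x80000123`) maps to `0x212345123` for a device with a 64-bit DMA mask, but is redirected to the bounce buffer at `0x04000000` for a device limited to 32 bits. A buffer in Dom0's own memory is returned unchanged in both cases.
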
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2605,7 +2605,6 @@ config PCI_OLPC
 config PCI_XEN
 	def_bool y
 	depends on PCI && XEN
-	select SWIOTLB_XEN
 
 config MMCONF_FAM10H
 	def_bool y
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -177,6 +177,7 @@ config XEN_GRANT_DMA_ALLOC
 
 config SWIOTLB_XEN
 	def_bool y
+	depends on XEN_PV || ARM || ARM64
 	select DMA_OPS
 	select SWIOTLB
 
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -46,19 +46,7 @@ extern unsigned long *xen_contiguous_bit
 int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
 				unsigned int address_bits,
 				dma_addr_t *dma_handle);
-
 void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order);
-#else
-static inline int xen_create_contiguous_region(phys_addr_t pstart,
-					       unsigned int order,
-					       unsigned int address_bits,
-					       dma_addr_t *dma_handle)
-{
-	return 0;
-}
-
-static inline void xen_destroy_contiguous_region(phys_addr_t pstart,
-						 unsigned int order) { }
 #endif
 
 #if defined(CONFIG_XEN_PV)

The code is unreachable for HVM or PVH, and it also makes little sense
in auto-translated environments. On Arm, with
xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
time seeing what good the Xen specific variant does - the generic one
ought to be fine for all purposes there. Still Arm code explicitly
references symbols here, so the code will continue to be included there.

Instead of making PCI_XEN's "select" conditional, simply drop it -
SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
is - as explained above - dead code in non-PV environments.

This in turn allows dropping the stubs for
xen_{create,destroy}_contiguous_region(), the former of which was broken
anyway - it failed to set the DMA handle output.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
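
The remark that the removed stub "failed to set the DMA handle output" can be illustrated with a small userspace model. The `sk_` names, the typedefs, and the sentinel value are hypothetical. `sk_stub_fixed` only shows what a well-behaved stub might have looked like under an identity-mapping assumption; the patch itself removes the stubs rather than fixing them.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sk_phys_addr_t; /* stand-in for phys_addr_t */
typedef uint64_t sk_dma_addr_t;  /* stand-in for dma_addr_t */

/* Marker for "the callee never wrote the output". */
#define SK_SENTINEL UINT64_C(0xdeadbeefdeadbeef)

typedef int (*sk_create_fn)(sk_phys_addr_t, unsigned int, unsigned int,
                            sk_dma_addr_t *);

/* Same shape as the removed stub: reports success, leaves the output alone. */
static int sk_stub_broken(sk_phys_addr_t pstart, unsigned int order,
                          unsigned int address_bits,
                          sk_dma_addr_t *dma_handle)
{
    (void)pstart; (void)order; (void)address_bits; (void)dma_handle;
    return 0;
}

/* Hypothetical repaired stub, assuming DMA address == physical address. */
static int sk_stub_fixed(sk_phys_addr_t pstart, unsigned int order,
                         unsigned int address_bits,
                         sk_dma_addr_t *dma_handle)
{
    (void)order; (void)address_bits;
    *dma_handle = pstart;
    return 0;
}

/* Model of a caller that trusts the handle whenever the call succeeds. */
static sk_dma_addr_t sk_handle_after(sk_create_fn fn, sk_phys_addr_t pstart)
{
    sk_dma_addr_t handle = SK_SENTINEL;
    int rc = fn(pstart, 0, 32, &handle);

    assert(rc == 0); /* both variants report success */
    return handle;
}
```

Here `sk_handle_after(sk_stub_broken, 0x1000)` yields the untouched sentinel even though the call reported success, whereas `sk_handle_after(sk_stub_fixed, 0x1000)` yields `0x1000`. In a real caller the handle would typically start out uninitialized, so the value it used after the "successful" call would be arbitrary.
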