diff mbox series

[12/12] swiotlb-xen: this is PV-only on x86

Message ID 004feaef-f3bb-e4bb-fb10-f205a9f69f28@suse.com (mailing list archive)
State New, archived
Headers show
Series swiotlb-xen: fixes and adjustments | expand

Commit Message

Jan Beulich Sept. 7, 2021, 12:13 p.m. UTC
The code is unreachable for HVM or PVH, and it also makes little sense
in auto-translated environments. On Arm, with
xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
time seeing what good the Xen specific variant does - the generic one
ought to be fine for all purposes there. Still Arm code explicitly
references symbols here, so the code will continue to be included there.

Instead of making PCI_XEN's "select" conditional, simply drop it -
SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
is - as explained above - dead code in non-PV environments.

This in turn allows dropping the stubs for
xen_{create,destroy}_contiguous_region(), the former of which was broken
anyway - it failed to set the DMA handle output.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

Christoph Hellwig Sept. 8, 2021, 7:14 a.m. UTC | #1
On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
> The code is unreachable for HVM or PVH, and it also makes little sense
> in auto-translated environments. On Arm, with
> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
> time seeing what good the Xen specific variant does - the generic one
> ought to be fine for all purposes there. Still Arm code explicitly
> references symbols here, so the code will continue to be included there.

Can the Xen/arm folks look into that?  Getting ARM out of using
swiotlb-xen would be a huge step forward cleaning up some DMA APIs.

> 
> Instead of making PCI_XEN's "select" conditional, simply drop it -
> SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
> is - as explained above - dead code in non-PV environments.
> 
> This in turn allows dropping the stubs for
> xen_{create,destroy}_contiguous_region(), the former of which was broken
> anyway - it failed to set the DMA handle output.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Stefano Stabellini Sept. 10, 2021, 11:48 p.m. UTC | #2
On Wed, 8 Sep 2021, Christoph Hellwig wrote:
> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
> > The code is unreachable for HVM or PVH, and it also makes little sense
> > in auto-translated environments. On Arm, with
> > xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
> > time seeing what good the Xen specific variant does - the generic one
> > ought to be fine for all purposes there. Still Arm code explicitly
> > references symbols here, so the code will continue to be included there.
> 
> Can the Xen/arm folks look into that?  Getting ARM out of using
> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.

On ARM swiotlb-xen is used for a different purpose compared to x86.

Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
physical == physical address).

Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
swiotlb-xen. But when we start using PV drivers to share the network or
disk between Dom0 and DomU we are going to get DomU pages mapped in
Dom0, we call them "foreign pages".  They are not mapped 1:1. It can
happen that one of these foreign pages are used for DMA operations
(e.g. related to the NIC). swiotlb-xen is used to detect these
situations and translate the guest physical address to physical address
of foreign pages appropriately.

If an IOMMU is available and the DMA-mastering device is behind it, then
swiotlb-xen is not necessary. FYI there is community interest in
selectively disabling swiotlb-xen for devices that are behind an IOMMU.


> > Instead of making PCI_XEN's "select" conditional, simply drop it -
> > SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
> > is - as explained above - dead code in non-PV environments.
> > 
> > This in turn allows dropping the stubs for
> > xen_{create,destroy}_contiguous_region(), the former of which was broken
> > anyway - it failed to set the DMA handle output.
> 
> Looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Jan Beulich Sept. 13, 2021, 7:28 a.m. UTC | #3
On 11.09.2021 01:48, Stefano Stabellini wrote:
> On Wed, 8 Sep 2021, Christoph Hellwig wrote:
>> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
>>> The code is unreachable for HVM or PVH, and it also makes little sense
>>> in auto-translated environments. On Arm, with
>>> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
>>> time seeing what good the Xen specific variant does - the generic one
>>> ought to be fine for all purposes there. Still Arm code explicitly
>>> references symbols here, so the code will continue to be included there.
>>
>> Can the Xen/arm folks look into that?  Getting ARM out of using
>> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.
> 
> On ARM swiotlb-xen is used for a different purpose compared to x86.
> 
> Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
> devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
> physical == physical address).
> 
> Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
> swiotlb-xen. But when we start using PV drivers to share the network or
> disk between Dom0 and DomU we are going to get DomU pages mapped in
> Dom0, we call them "foreign pages".  They are not mapped 1:1. It can
> happen that one of these foreign pages are used for DMA operations
> (e.g. related to the NIC). swiotlb-xen is used to detect these
> situations and translate the guest physical address to physical address
> of foreign pages appropriately.

Hmm, you say "translate", which isn't my understanding of swiotlb's
purpose. As per my understanding swiotlb instead double buffers data
such that is becomes accessible, or suitably arranges underlying
machine addresses. The latter part is clearly a PV-only thing, unused
by Arm as can be seen by there not being any use of XENMEM_exchange.
So it must be the former part that you're talking about, but that's
also the purpose of the non-Xen swiotlb code. If only for my own
education and understanding, could you point me at the difference
between swiotlb-xen and generic swiotlb which addresses this specific
aspect of Arm behavior?

Jan
Jan Beulich Sept. 13, 2021, 7:51 a.m. UTC | #4
On 11.09.2021 01:48, Stefano Stabellini wrote:
> On Wed, 8 Sep 2021, Christoph Hellwig wrote:
>> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
>>> The code is unreachable for HVM or PVH, and it also makes little sense
>>> in auto-translated environments. On Arm, with
>>> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
>>> time seeing what good the Xen specific variant does - the generic one
>>> ought to be fine for all purposes there. Still Arm code explicitly
>>> references symbols here, so the code will continue to be included there.
>>
>> Can the Xen/arm folks look into that?  Getting ARM out of using
>> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.
> 
> On ARM swiotlb-xen is used for a different purpose compared to x86.
> 
> Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
> devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
> physical == physical address).
> 
> Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
> swiotlb-xen. But when we start using PV drivers to share the network or
> disk between Dom0 and DomU we are going to get DomU pages mapped in
> Dom0, we call them "foreign pages".  They are not mapped 1:1. It can
> happen that one of these foreign pages are used for DMA operations
> (e.g. related to the NIC). swiotlb-xen is used to detect these
> situations and translate the guest physical address to physical address
> of foreign pages appropriately.

Thinking about this some more - if Dom0 is 1:1 mapped, why don't you
map foreign pages 1:1 as well then?

>>> Instead of making PCI_XEN's "select" conditional, simply drop it -
>>> SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
>>> is - as explained above - dead code in non-PV environments.
>>>
>>> This in turn allows dropping the stubs for
>>> xen_{create,destroy}_contiguous_region(), the former of which was broken
>>> anyway - it failed to set the DMA handle output.
>>
>> Looks good:
>>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Thanks for this and the other reviews.

Jan
Stefano Stabellini Sept. 13, 2021, 8:49 p.m. UTC | #5
On Mon, 13 Sep 2021, Jan Beulich wrote:
> On 11.09.2021 01:48, Stefano Stabellini wrote:
> > On Wed, 8 Sep 2021, Christoph Hellwig wrote:
> >> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
> >>> The code is unreachable for HVM or PVH, and it also makes little sense
> >>> in auto-translated environments. On Arm, with
> >>> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
> >>> time seeing what good the Xen specific variant does - the generic one
> >>> ought to be fine for all purposes there. Still Arm code explicitly
> >>> references symbols here, so the code will continue to be included there.
> >>
> >> Can the Xen/arm folks look into that?  Getting ARM out of using
> >> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.
> > 
> > On ARM swiotlb-xen is used for a different purpose compared to x86.
> > 
> > Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
> > devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
> > physical == physical address).
> > 
> > Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
> > swiotlb-xen. But when we start using PV drivers to share the network or
> > disk between Dom0 and DomU we are going to get DomU pages mapped in
> > Dom0, we call them "foreign pages".  They are not mapped 1:1. It can
> > happen that one of these foreign pages are used for DMA operations
> > (e.g. related to the NIC). swiotlb-xen is used to detect these
> > situations and translate the guest physical address to physical address
> > of foreign pages appropriately.
> 
> Thinking about this some more - if Dom0 is 1:1 mapped, why don't you
> map foreign pages 1:1 as well then?

That's because the foreign page, from Linux POV, would appear out of
thin air. It would just show up in a region not considered memory just
few moments before, so there would be no memblock, no struct page,
nothing. At least in the past that caused serious issues to the kernel.

This is the reason why the kernel is using ballooned-out pages to map
foreign pages even on x86:

drivers/block/xen-blkback/blkback.c:xen_blkbk_map -> gnttab_page_cache_get
drivers/xen/grant-table.c:gnttab_page_cache_get
drivers/xen/grant-table.c:gnttab_alloc_pages
drivers/xen/unpopulated-alloc.c:xen_alloc_unpopulated_pages
drivers/xen/balloon.c:alloc_xenballooned_pages
Stefano Stabellini Sept. 13, 2021, 8:54 p.m. UTC | #6
On Mon, 13 Sep 2021, Jan Beulich wrote:
> On 11.09.2021 01:48, Stefano Stabellini wrote:
> > On Wed, 8 Sep 2021, Christoph Hellwig wrote:
> >> On Tue, Sep 07, 2021 at 02:13:21PM +0200, Jan Beulich wrote:
> >>> The code is unreachable for HVM or PVH, and it also makes little sense
> >>> in auto-translated environments. On Arm, with
> >>> xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
> >>> time seeing what good the Xen specific variant does - the generic one
> >>> ought to be fine for all purposes there. Still Arm code explicitly
> >>> references symbols here, so the code will continue to be included there.
> >>
> >> Can the Xen/arm folks look into that?  Getting ARM out of using
> >> swiotlb-xen would be a huge step forward cleaning up some DMA APIs.
> > 
> > On ARM swiotlb-xen is used for a different purpose compared to x86.
> > 
> > Many ARM SoCs still don't have an IOMMU covering all DMA-mastering
> > devices (e.g. Raspberry Pi 4). As a consequence we map Dom0 1:1 (guest
> > physical == physical address).
> > 
> > Now if it was just for Dom0, thanks to the 1:1 mapping, we wouldn't need
> > swiotlb-xen. But when we start using PV drivers to share the network or
> > disk between Dom0 and DomU we are going to get DomU pages mapped in
> > Dom0, we call them "foreign pages".  They are not mapped 1:1. It can
> > happen that one of these foreign pages are used for DMA operations
> > (e.g. related to the NIC). swiotlb-xen is used to detect these
> > situations and translate the guest physical address to physical address
> > of foreign pages appropriately.
> 
> Hmm, you say "translate", which isn't my understanding of swiotlb's
> purpose. As per my understanding swiotlb instead double buffers data
> such that is becomes accessible, or suitably arranges underlying
> machine addresses. The latter part is clearly a PV-only thing, unused
> by Arm as can be seen by there not being any use of XENMEM_exchange.
> So it must be the former part that you're talking about, but that's
> also the purpose of the non-Xen swiotlb code. If only for my own
> education and understanding, could you point me at the difference
> between swiotlb-xen and generic swiotlb which addresses this specific
> aspect of Arm behavior?

If you look at xen_swiotlb_map_page, you'll see the call to
xen_phys_to_dma which eventually calls arch/arm/xen/p2m.c:__pfn_to_mfn.
If everything goes well and we only need to do translation we'll "goto
done". Otherwise, we'll fall back on a swiotlb buffer with
swiotlb_tbl_map_single, the result of which also needs to be translated,
see the second call to xen_phys_to_dma.
diff mbox series

Patch

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2605,7 +2605,6 @@  config PCI_OLPC
 config PCI_XEN
 	def_bool y
 	depends on PCI && XEN
-	select SWIOTLB_XEN
 
 config MMCONF_FAM10H
 	def_bool y
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -177,6 +177,7 @@  config XEN_GRANT_DMA_ALLOC
 
 config SWIOTLB_XEN
 	def_bool y
+	depends on XEN_PV || ARM || ARM64
 	select DMA_OPS
 	select SWIOTLB
 
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -46,19 +46,7 @@  extern unsigned long *xen_contiguous_bit
 int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
 				unsigned int address_bits,
 				dma_addr_t *dma_handle);
-
 void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order);
-#else
-static inline int xen_create_contiguous_region(phys_addr_t pstart,
-					       unsigned int order,
-					       unsigned int address_bits,
-					       dma_addr_t *dma_handle)
-{
-	return 0;
-}
-
-static inline void xen_destroy_contiguous_region(phys_addr_t pstart,
-						 unsigned int order) { }
 #endif
 
 #if defined(CONFIG_XEN_PV)