Message ID | 20190121050056.14325-1-peng.fan@nxp.com (mailing list archive)
---|---
State | New, archived
Series | [RFC] virtio_ring: check dma_mem for xen_domain
On Mon, Jan 21, 2019 at 04:51:57AM +0000, Peng Fan wrote:
> On i.MX8QM, M4_1 is communicating with DomU using rpmsg, with a fixed,
> predefined address as the DMA memory buffer.
>
> Without this patch, the flow is:
> vring_map_one_sg -> vring_use_dma_api
>  -> dma_map_page
>   -> __swiotlb_map_page
>    -> swiotlb_map_page
>     -> __dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> However, we are using a per-device DMA area for rpmsg, and phys_to_virt
> cannot return a correct virtual address for an address in the vmalloc
> area. Then the kernel panics.

And that is the right thing to do. You must not call dma_map_* on
memory that was allocated from dma_alloc_*.

We actually have another thread which appears to be about this same
issue.
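To make the rule above concrete, here is a minimal kernel-style sketch (not from the thread; the device, buffer size, and function name are placeholders). The commented-out call is the kind of misuse that produces the panic Peng describes:

```c
#include <linux/dma-mapping.h>

/* Sketch only: coherent allocations come back already mapped for the
 * device; only ordinary kernel memory may go through the streaming
 * (dma_map_*) API. */
static int dma_usage_sketch(struct device *dev)
{
	dma_addr_t handle;
	void *buf;

	/* Correct: "handle" is already a device-usable address. */
	buf = dma_alloc_coherent(dev, PAGE_SIZE, &handle, GFP_KERNEL);
	if (!buf)
		return -ENOMEM;

	/*
	 * Wrong -- what the rpmsg path effectively does: feeding the
	 * coherent buffer back into the streaming API.  On arm64 the
	 * swiotlb path then calls phys_to_virt(dma_to_phys(dev, addr)),
	 * which cannot recover a vmalloc-space mapping, and the kernel
	 * panics:
	 *
	 * dma_map_single(dev, buf, PAGE_SIZE, DMA_TO_DEVICE);
	 */

	dma_free_coherent(dev, PAGE_SIZE, buf, handle);
	return 0;
}
```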
Hi,

> -----Original Message-----
> From: hch@infradead.org [mailto:hch@infradead.org]
> Sent: January 21, 2019 16:29
> To: Peng Fan <peng.fan@nxp.com>
> Cc: mst@redhat.com; jasowang@redhat.com; sstabellini@kernel.org;
> hch@infradead.org; virtualization@lists.linux-foundation.org;
> xen-devel@lists.xenproject.org; linux-kernel@vger.kernel.org;
> linux-remoteproc@vger.kernel.org
> Subject: Re: [RFC] virtio_ring: check dma_mem for xen_domain
>
> On Mon, Jan 21, 2019 at 04:51:57AM +0000, Peng Fan wrote:
> > On i.MX8QM, M4_1 is communicating with DomU using rpmsg, with a fixed,
> > predefined address as the DMA memory buffer.
> >
> > Without this patch, the flow is:
> > vring_map_one_sg -> vring_use_dma_api
> >  -> dma_map_page
> >   -> __swiotlb_map_page
> >    -> swiotlb_map_page
> >     -> __dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> > However, we are using a per-device DMA area for rpmsg, and phys_to_virt
> > cannot return a correct virtual address for an address in the vmalloc
> > area. Then the kernel panics.
>
> And that is the right thing to do. You must not call dma_map_* on memory
> that was allocated from dma_alloc_*.

Understood. But currently vring_use_dma_api will always return true if
the OS is a Xen VM. It actually needs to return false for my case, so
that we can use sg_phys(sg) to get the correct physical address.

> We actually have another thread which appears to be about this same issue.

You mean https://patchwork.kernel.org/patch/10742923/ ?

You suggested using CMA there, but vring_use_dma_api will still return
true if the OS is running on Xen, so vring_map_one_sg will still run
into __dma_map_area. In my case, vring_use_dma_api just needs to return
false so that sg_phys(sg) yields the correct physical address, whether
a per-device reserved DMA area or per-device CMA is used.

Thanks,
Peng.
On Mon, Jan 21, 2019 at 12:28:30AM -0800, hch@infradead.org wrote:
> On Mon, Jan 21, 2019 at 04:51:57AM +0000, Peng Fan wrote:
> > On i.MX8QM, M4_1 is communicating with DomU using rpmsg, with a fixed,
> > predefined address as the DMA memory buffer.
> >
> > Without this patch, the flow is:
> > vring_map_one_sg -> vring_use_dma_api
> >  -> dma_map_page
> >   -> __swiotlb_map_page
> >    -> swiotlb_map_page
> >     -> __dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> > However, we are using a per-device DMA area for rpmsg, and phys_to_virt
> > cannot return a correct virtual address for an address in the vmalloc
> > area. Then the kernel panics.
>
> And that is the right thing to do. You must not call dma_map_* on
> memory that was allocated from dma_alloc_*.
>
> We actually have another thread which appears to be about this same issue.

Sorry, which thread do you refer to?
On Mon, 21 Jan 2019, Peng Fan wrote:
> On i.MX8QM, M4_1 is communicating with DomU using rpmsg, with a fixed,
> predefined address as the DMA memory buffer.
>
> Without this patch, the flow is:
> vring_map_one_sg -> vring_use_dma_api
>  -> dma_map_page
>   -> __swiotlb_map_page
>    -> swiotlb_map_page
>     -> __dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> However, we are using a per-device DMA area for rpmsg, and phys_to_virt
> cannot return a correct virtual address for an address in the vmalloc
> area. Then the kernel panics.
>
> With this patch, vring_use_dma_api will return false, and
> vring_map_one_sg will return sg_phys(sg), which is the correct physical
> address in the predefined memory region:
> vring_map_one_sg -> vring_use_dma_api
>  -> sg_phys(sg)
>
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> ---
>  drivers/virtio/virtio_ring.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cd7e755484e3..8993d7cb3592 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -248,6 +248,8 @@ static inline bool virtqueue_use_indirect(struct virtqueue *_vq,
>  
>  static bool vring_use_dma_api(struct virtio_device *vdev)
>  {
> +	struct device *dma_dev = vdev->dev.parent;
> +
>  	if (!virtio_has_iommu_quirk(vdev))
>  		return true;
>  
> @@ -260,7 +262,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  	 * the DMA API if we're a Xen guest, which at least allows
>  	 * all of the sensible Xen configurations to work correctly.
>  	 */
> -	if (xen_domain())
> +	if (xen_domain() && !dma_dev->dma_mem)
>  		return true;
>  
>  	return false;

I can see you spotted a real issue, but this is not the right fix. We
just need something a bit more flexible than xen_domain(): there are
many kinds of Xen domains on different architectures; we basically want
to enable this (return true from vring_use_dma_api) only when the Xen
swiotlb is meant to be used. Does the appended patch fix the issue you
have?

---

xen: introduce xen_vring_use_dma

From: Stefano Stabellini <stefanos@xilinx.com>

Export xen_swiotlb on arm and arm64.

Use xen_swiotlb to determine when vring should use dma APIs to map the
ring: when xen_swiotlb is enabled, the dma API is required; when it is
disabled, it is not required.

Reported-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>

diff --git a/arch/arm/include/asm/xen/swiotlb-xen.h b/arch/arm/include/asm/xen/swiotlb-xen.h
new file mode 100644
index 0000000..455ade5
--- /dev/null
+++ b/arch/arm/include/asm/xen/swiotlb-xen.h
@@ -0,0 +1 @@
+#include <xen/arm/swiotlb-xen.h>
diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index cb44aa2..8592863 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -21,6 +21,8 @@
 #include <asm/xen/hypercall.h>
 #include <asm/xen/interface.h>
 
+int xen_swiotlb __read_mostly;
+
 unsigned long xen_get_swiotlb_free_pages(unsigned int order)
 {
 	struct memblock_region *reg;
@@ -189,6 +191,7 @@ int __init xen_mm_init(void)
 	struct gnttab_cache_flush cflush;
 	if (!xen_initial_domain())
 		return 0;
+	xen_swiotlb = 1;
 	xen_swiotlb_init(1, false);
 	xen_dma_ops = &xen_swiotlb_dma_ops;
 
diff --git a/arch/arm64/include/asm/xen/swiotlb-xen.h b/arch/arm64/include/asm/xen/swiotlb-xen.h
new file mode 100644
index 0000000..455ade5
--- /dev/null
+++ b/arch/arm64/include/asm/xen/swiotlb-xen.h
@@ -0,0 +1 @@
+#include <xen/arm/swiotlb-xen.h>
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cd7e755..bf8badc 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -260,7 +260,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
 	 * the DMA API if we're a Xen guest, which at least allows
 	 * all of the sensible Xen configurations to work correctly.
 	 */
-	if (xen_domain())
+	if (xen_vring_use_dma())
 		return true;
 
 	return false;
diff --git a/include/xen/arm/swiotlb-xen.h b/include/xen/arm/swiotlb-xen.h
new file mode 100644
index 0000000..2aac7c4
--- /dev/null
+++ b/include/xen/arm/swiotlb-xen.h
@@ -0,0 +1,10 @@
+#ifndef _ASM_ARM_XEN_SWIOTLB_XEN_H
+#define _ASM_ARM_XEN_SWIOTLB_XEN_H
+
+#ifdef CONFIG_SWIOTLB_XEN
+extern int xen_swiotlb;
+#else
+#define xen_swiotlb (0)
+#endif
+
+#endif
diff --git a/include/xen/xen.h b/include/xen/xen.h
index 0e21567..74a536d 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -46,4 +46,10 @@ enum xen_domain_type {
 bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
 		const struct bio_vec *vec2);
 
+#include <asm/xen/swiotlb-xen.h>
+static inline int xen_vring_use_dma(void)
+{
+	return !!xen_swiotlb;
+}
+
 #endif /* _XEN_XEN_H */
On Tue, Jan 22, 2019 at 11:59:31AM -0800, Stefano Stabellini wrote:
> On Mon, 21 Jan 2019, Peng Fan wrote:
> > On i.MX8QM, M4_1 is communicating with DomU using rpmsg, with a fixed,
> > predefined address as the DMA memory buffer.
> >
> > Without this patch, the flow is:
> > vring_map_one_sg -> vring_use_dma_api
> >  -> dma_map_page
> >   -> __swiotlb_map_page
> >    -> swiotlb_map_page
> >     -> __dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> > However, we are using a per-device DMA area for rpmsg, and phys_to_virt
> > cannot return a correct virtual address for an address in the vmalloc
> > area. Then the kernel panics.
> >
> > With this patch, vring_use_dma_api will return false, and
> > vring_map_one_sg will return sg_phys(sg), which is the correct physical
> > address in the predefined memory region:
> > vring_map_one_sg -> vring_use_dma_api
> >  -> sg_phys(sg)
> >
> > Signed-off-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/virtio/virtio_ring.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cd7e755484e3..8993d7cb3592 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -248,6 +248,8 @@ static inline bool virtqueue_use_indirect(struct virtqueue *_vq,
> >  
> >  static bool vring_use_dma_api(struct virtio_device *vdev)
> >  {
> > +	struct device *dma_dev = vdev->dev.parent;
> > +
> >  	if (!virtio_has_iommu_quirk(vdev))
> >  		return true;
> >  
> > @@ -260,7 +262,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  	 * the DMA API if we're a Xen guest, which at least allows
> >  	 * all of the sensible Xen configurations to work correctly.
> >  	 */
> > -	if (xen_domain())
> > +	if (xen_domain() && !dma_dev->dma_mem)
> >  		return true;
> >  
> >  	return false;
>
> I can see you spotted a real issue, but this is not the right fix. We
> just need something a bit more flexible than xen_domain(): there are
> many kinds of Xen domains on different architectures; we basically want
> to enable this (return true from vring_use_dma_api) only when the Xen
> swiotlb is meant to be used. Does the appended patch fix the issue you
> have?
>
> ---
>
> xen: introduce xen_vring_use_dma
>
> From: Stefano Stabellini <stefanos@xilinx.com>
>
> Export xen_swiotlb on arm and arm64.
>
> Use xen_swiotlb to determine when vring should use dma APIs to map the
> ring: when xen_swiotlb is enabled, the dma API is required; when it is
> disabled, it is not required.
>
> Reported-by: Peng Fan <peng.fan@nxp.com>
> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
>
> diff --git a/arch/arm/include/asm/xen/swiotlb-xen.h b/arch/arm/include/asm/xen/swiotlb-xen.h
> new file mode 100644
> index 0000000..455ade5
> --- /dev/null
> +++ b/arch/arm/include/asm/xen/swiotlb-xen.h
> @@ -0,0 +1 @@
> +#include <xen/arm/swiotlb-xen.h>
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index cb44aa2..8592863 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -21,6 +21,8 @@
>  #include <asm/xen/hypercall.h>
>  #include <asm/xen/interface.h>
>  
> +int xen_swiotlb __read_mostly;
> +
>  unsigned long xen_get_swiotlb_free_pages(unsigned int order)
>  {
>  	struct memblock_region *reg;
> @@ -189,6 +191,7 @@ int __init xen_mm_init(void)
>  	struct gnttab_cache_flush cflush;
>  	if (!xen_initial_domain())
>  		return 0;
> +	xen_swiotlb = 1;
>  	xen_swiotlb_init(1, false);
>  	xen_dma_ops = &xen_swiotlb_dma_ops;
>  
> diff --git a/arch/arm64/include/asm/xen/swiotlb-xen.h b/arch/arm64/include/asm/xen/swiotlb-xen.h
> new file mode 100644
> index 0000000..455ade5
> --- /dev/null
> +++ b/arch/arm64/include/asm/xen/swiotlb-xen.h
> @@ -0,0 +1 @@
> +#include <xen/arm/swiotlb-xen.h>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cd7e755..bf8badc 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -260,7 +260,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  	 * the DMA API if we're a Xen guest, which at least allows
>  	 * all of the sensible Xen configurations to work correctly.
>  	 */
> -	if (xen_domain())
> +	if (xen_vring_use_dma())
>  		return true;
>  
>  	return false;
> diff --git a/include/xen/arm/swiotlb-xen.h b/include/xen/arm/swiotlb-xen.h
> new file mode 100644
> index 0000000..2aac7c4
> --- /dev/null
> +++ b/include/xen/arm/swiotlb-xen.h
> @@ -0,0 +1,10 @@
> +#ifndef _ASM_ARM_XEN_SWIOTLB_XEN_H
> +#define _ASM_ARM_XEN_SWIOTLB_XEN_H
> +
> +#ifdef CONFIG_SWIOTLB_XEN
> +extern int xen_swiotlb;
> +#else
> +#define xen_swiotlb (0)
> +#endif
> +
> +#endif
> diff --git a/include/xen/xen.h b/include/xen/xen.h
> index 0e21567..74a536d 100644
> --- a/include/xen/xen.h
> +++ b/include/xen/xen.h
> @@ -46,4 +46,10 @@ enum xen_domain_type {
>  bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
>  		const struct bio_vec *vec2);
>  
> +#include <asm/xen/swiotlb-xen.h>
> +static inline int xen_vring_use_dma(void)
> +{
> +	return !!xen_swiotlb;

Given xen_swiotlb is only defined on arm, how will this build on other
architectures?

> +}
> +
>  #endif /* _XEN_XEN_H */

I'd say at this point, I'm sorry we didn't come up with PLATFORM_ACCESS
when we added the Xen hack. I'm not objecting to this patch, but I would
also like to bypass the Xen hack for VIRTIO 1 devices. In particular, I
know rpmsg is still virtio 0, so it's not an issue for it.

Is Xen already using VIRTIO 1, and without setting PLATFORM_ACCESS? If
not, I would like to teach vring_use_dma_api to return false on
VIRTIO_1 && !PLATFORM_ACCESS. Would that be acceptable?
On Tue, Jan 22, 2019 at 11:59:31AM -0800, Stefano Stabellini wrote:
> >  	if (!virtio_has_iommu_quirk(vdev))
> >  		return true;
> >  
> > @@ -260,7 +262,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  	 * the DMA API if we're a Xen guest, which at least allows
> >  	 * all of the sensible Xen configurations to work correctly.
> >  	 */
> > -	if (xen_domain())
> > +	if (xen_domain() && !dma_dev->dma_mem)
> >  		return true;
> >  
> >  	return false;
>
> I can see you spotted a real issue, but this is not the right fix. We
> just need something a bit more flexible than xen_domain(): there are
> many kinds of Xen domains on different architectures; we basically want
> to enable this (return true from vring_use_dma_api) only when the Xen
> swiotlb is meant to be used. Does the appended patch fix the issue you
> have?

The problem generally is the other way around - if dma_dev->dma_mem is
set, the device description in the device tree explicitly requires using
this memory, so we must _always_ use the DMA API.

The problem is just that the rproc driver abuses the DMA API in
horrible ways.
On Tue, 22 Jan 2019, hch@infradead.org wrote:
> On Tue, Jan 22, 2019 at 11:59:31AM -0800, Stefano Stabellini wrote:
> > >  	if (!virtio_has_iommu_quirk(vdev))
> > >  		return true;
> > >  
> > > @@ -260,7 +262,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> > >  	 * the DMA API if we're a Xen guest, which at least allows
> > >  	 * all of the sensible Xen configurations to work correctly.
> > >  	 */
> > > -	if (xen_domain())
> > > +	if (xen_domain() && !dma_dev->dma_mem)
> > >  		return true;
> > >  
> > >  	return false;
> >
> > I can see you spotted a real issue, but this is not the right fix. We
> > just need something a bit more flexible than xen_domain(): there are
> > many kinds of Xen domains on different architectures; we basically want
> > to enable this (return true from vring_use_dma_api) only when the Xen
> > swiotlb is meant to be used. Does the appended patch fix the issue you
> > have?
>
> The problem generally is the other way around - if dma_dev->dma_mem is
> set, the device description in the device tree explicitly requires using
> this memory, so we must _always_ use the DMA API.
>
> The problem is just that the rproc driver abuses the DMA API in
> horrible ways.

If vring_use_dma_api is actually supposed to return true when
dma_dev->dma_mem is set, then both Peng's patch and the patch I wrote
are not fixing the real issue here.

I don't know enough about remoteproc to know where the problem actually
lies, though.
On Wed, Jan 23, 2019 at 01:04:33PM -0800, Stefano Stabellini wrote:
> If vring_use_dma_api is actually supposed to return true when
> dma_dev->dma_mem is set, then both Peng's patch and the patch I wrote
> are not fixing the real issue here.
>
> I don't know enough about remoteproc to know where the problem actually
> lies, though.

The problem is the following:

Devices can declare a specific memory region that they want to use when
the driver calls dma_alloc_coherent for the device. This is done using
the shared-dma-pool DT attribute, which comes in two variants that would
be a little too much to explain here.

remoteproc makes use of that because apparently the device can only
communicate using that region. But it then feeds memory obtained with
dma_alloc_coherent back into the virtio code. For that it calls
vmalloc_to_page on the dma_alloc_coherent buffer, which is a huge no-go
for the DMA API and only worked accidentally on a few platforms, and
apparently arm64 just changed a few internals that made it stop working
for remoteproc.

The right answer is to not use the DMA API to allocate memory from a
device-specific region, but to tie the driver directly into the DT
reserved memory API in a way that allows it to easily obtain a struct
device for it.

This is orthogonal to another issue, and that is that hardware virtio
devices really always need to use the DMA API, otherwise we'll bypass
such features as device-specific DMA pools, DMA offsets, cache
flushing, etc, etc.
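As a sketch of the anti-pattern described above — assuming only a device with a shared-dma-pool carveout; the function name is hypothetical — the core problem is that a coherent buffer is not guaranteed to have a struct page behind it at all:

```c
#include <linux/dma-mapping.h>
#include <linux/vmalloc.h>

/* Sketch of the remoteproc anti-pattern: allocate the vring from the
 * device's coherent pool, then try to hand struct pages to virtio. */
static struct page *vring_page_sketch(struct device *dev, size_t len,
				      dma_addr_t *da)
{
	void *va = dma_alloc_coherent(dev, len, da, GFP_KERNEL);

	if (!va)
		return NULL;
	/*
	 * Undefined per the DMA API: depending on the platform, "va" may
	 * come from a remapped per-device pool (vmalloc space), from the
	 * atomic pool, or from the linear map -- so neither
	 * vmalloc_to_page() nor virt_to_page() is valid here in general.
	 */
	return vmalloc_to_page(va);
}
```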
On Wed, 23 Jan 2019, hch@infradead.org wrote:
> On Wed, Jan 23, 2019 at 01:04:33PM -0800, Stefano Stabellini wrote:
> > If vring_use_dma_api is actually supposed to return true when
> > dma_dev->dma_mem is set, then both Peng's patch and the patch I wrote
> > are not fixing the real issue here.
> >
> > I don't know enough about remoteproc to know where the problem actually
> > lies, though.
>
> The problem is the following:
>
> Devices can declare a specific memory region that they want to use when
> the driver calls dma_alloc_coherent for the device. This is done using
> the shared-dma-pool DT attribute, which comes in two variants that would
> be a little too much to explain here.
>
> remoteproc makes use of that because apparently the device can only
> communicate using that region. But it then feeds memory obtained with
> dma_alloc_coherent back into the virtio code. For that it calls
> vmalloc_to_page on the dma_alloc_coherent buffer, which is a huge no-go
> for the DMA API and only worked accidentally on a few platforms, and
> apparently arm64 just changed a few internals that made it stop working
> for remoteproc.
>
> The right answer is to not use the DMA API to allocate memory from a
> device-specific region, but to tie the driver directly into the DT
> reserved memory API in a way that allows it to easily obtain a struct
> device for it.

If I understand correctly, Peng should be able to reproduce the problem
on native Linux without any Xen involvement simply by forcing
vring_use_dma_api to return true. Peng, can you confirm?

And the right fix is not to call vmalloc_to_page on a dma_alloc_coherent
buffer -- I don't know about the recent changes on arm64, but that's not
going to work with arm32 either, AFAIK. Given that I don't have a repro,
I'll leave it to Peng and/or others to send the appropriate patch for
remoteproc.

> This is orthogonal to another issue, and that is that hardware virtio
> devices really always need to use the DMA API, otherwise we'll bypass
> such features as device-specific DMA pools, DMA offsets, cache
> flushing, etc, etc.

I understand, I'll drop my patch.
> -----Original Message-----
> From: Stefano Stabellini [mailto:sstabellini@kernel.org]
> Sent: January 23, 2019 4:00
> To: Peng Fan <peng.fan@nxp.com>
> Cc: mst@redhat.com; jasowang@redhat.com; sstabellini@kernel.org;
> hch@infradead.org; xen-devel@lists.xenproject.org;
> linux-remoteproc@vger.kernel.org; linux-kernel@vger.kernel.org;
> virtualization@lists.linux-foundation.org; luto@kernel.org; jgross@suse.com;
> boris.ostrovsky@oracle.com
> Subject: Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
>
> On Mon, 21 Jan 2019, Peng Fan wrote:
> > On i.MX8QM, M4_1 is communicating with DomU using rpmsg, with a fixed,
> > predefined address as the DMA memory buffer.
> >
> > Without this patch, the flow is:
> > vring_map_one_sg -> vring_use_dma_api
> >  -> dma_map_page
> >   -> __swiotlb_map_page
> >    -> swiotlb_map_page
> >     -> __dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
> > However, we are using a per-device DMA area for rpmsg, and phys_to_virt
> > cannot return a correct virtual address for an address in the vmalloc
> > area. Then the kernel panics.
> >
> > With this patch, vring_use_dma_api will return false, and
> > vring_map_one_sg will return sg_phys(sg), which is the correct physical
> > address in the predefined memory region:
> > vring_map_one_sg -> vring_use_dma_api
> >  -> sg_phys(sg)
> >
> > Signed-off-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/virtio/virtio_ring.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index cd7e755484e3..8993d7cb3592 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -248,6 +248,8 @@ static inline bool virtqueue_use_indirect(struct virtqueue *_vq,
> >  
> >  static bool vring_use_dma_api(struct virtio_device *vdev)
> >  {
> > +	struct device *dma_dev = vdev->dev.parent;
> > +
> >  	if (!virtio_has_iommu_quirk(vdev))
> >  		return true;
> >  
> > @@ -260,7 +262,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
> >  	 * the DMA API if we're a Xen guest, which at least allows
> >  	 * all of the sensible Xen configurations to work correctly.
> >  	 */
> > -	if (xen_domain())
> > +	if (xen_domain() && !dma_dev->dma_mem)
> >  		return true;
> >  
> >  	return false;
>
> I can see you spotted a real issue, but this is not the right fix. We
> just need something a bit more flexible than xen_domain(): there are
> many kinds of Xen domains on different architectures; we basically want
> to enable this (return true from vring_use_dma_api) only when the Xen
> swiotlb is meant to be used. Does the appended patch fix the issue you
> have?
>
> ---
>
> xen: introduce xen_vring_use_dma
>
> From: Stefano Stabellini <stefanos@xilinx.com>
>
> Export xen_swiotlb on arm and arm64.
>
> Use xen_swiotlb to determine when vring should use dma APIs to map the
> ring: when xen_swiotlb is enabled, the dma API is required; when it is
> disabled, it is not required.
>
> Reported-by: Peng Fan <peng.fan@nxp.com>
> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
>
> diff --git a/arch/arm/include/asm/xen/swiotlb-xen.h b/arch/arm/include/asm/xen/swiotlb-xen.h
> new file mode 100644
> index 0000000..455ade5
> --- /dev/null
> +++ b/arch/arm/include/asm/xen/swiotlb-xen.h
> @@ -0,0 +1 @@
> +#include <xen/arm/swiotlb-xen.h>
> diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
> index cb44aa2..8592863 100644
> --- a/arch/arm/xen/mm.c
> +++ b/arch/arm/xen/mm.c
> @@ -21,6 +21,8 @@
>  #include <asm/xen/hypercall.h>
>  #include <asm/xen/interface.h>
>  
> +int xen_swiotlb __read_mostly;
> +
>  unsigned long xen_get_swiotlb_free_pages(unsigned int order)
>  {
>  	struct memblock_region *reg;
> @@ -189,6 +191,7 @@ int __init xen_mm_init(void)
>  	struct gnttab_cache_flush cflush;
>  	if (!xen_initial_domain())
>  		return 0;
> +	xen_swiotlb = 1;
>  	xen_swiotlb_init(1, false);
>  	xen_dma_ops = &xen_swiotlb_dma_ops;
>  
> diff --git a/arch/arm64/include/asm/xen/swiotlb-xen.h b/arch/arm64/include/asm/xen/swiotlb-xen.h
> new file mode 100644
> index 0000000..455ade5
> --- /dev/null
> +++ b/arch/arm64/include/asm/xen/swiotlb-xen.h
> @@ -0,0 +1 @@
> +#include <xen/arm/swiotlb-xen.h>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index cd7e755..bf8badc 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -260,7 +260,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
>  	 * the DMA API if we're a Xen guest, which at least allows
>  	 * all of the sensible Xen configurations to work correctly.
>  	 */
> -	if (xen_domain())
> +	if (xen_vring_use_dma())
>  		return true;
>  
>  	return false;
> diff --git a/include/xen/arm/swiotlb-xen.h b/include/xen/arm/swiotlb-xen.h
> new file mode 100644
> index 0000000..2aac7c4
> --- /dev/null
> +++ b/include/xen/arm/swiotlb-xen.h
> @@ -0,0 +1,10 @@
> +#ifndef _ASM_ARM_XEN_SWIOTLB_XEN_H
> +#define _ASM_ARM_XEN_SWIOTLB_XEN_H
> +
> +#ifdef CONFIG_SWIOTLB_XEN
> +extern int xen_swiotlb;
> +#else
> +#define xen_swiotlb (0)
> +#endif
> +
> +#endif
> diff --git a/include/xen/xen.h b/include/xen/xen.h
> index 0e21567..74a536d 100644
> --- a/include/xen/xen.h
> +++ b/include/xen/xen.h
> @@ -46,4 +46,10 @@ enum xen_domain_type {
>  bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
>  		const struct bio_vec *vec2);
>  
> +#include <asm/xen/swiotlb-xen.h>
> +static inline int xen_vring_use_dma(void)
> +{
> +	return !!xen_swiotlb;
> +}
> +
>  #endif /* _XEN_XEN_H */

Tested-by: Peng Fan <peng.fan@nxp.com>

Thanks,
Peng
Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini [mailto:sstabellini@kernel.org]
> Sent: January 24, 2019 7:44
> To: hch@infradead.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Peng Fan
> <peng.fan@nxp.com>; mst@redhat.com; jasowang@redhat.com;
> xen-devel@lists.xenproject.org; linux-remoteproc@vger.kernel.org;
> linux-kernel@vger.kernel.org; virtualization@lists.linux-foundation.org;
> luto@kernel.org; jgross@suse.com; boris.ostrovsky@oracle.com;
> bjorn.andersson@linaro.org; jliang@xilinx.com
> Subject: Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
>
> On Wed, 23 Jan 2019, hch@infradead.org wrote:
> > On Wed, Jan 23, 2019 at 01:04:33PM -0800, Stefano Stabellini wrote:
> > > If vring_use_dma_api is actually supposed to return true when
> > > dma_dev->dma_mem is set, then both Peng's patch and the patch I
> > > wrote are not fixing the real issue here.
> > >
> > > I don't know enough about remoteproc to know where the problem
> > > actually lies, though.
> >
> > The problem is the following:
> >
> > Devices can declare a specific memory region that they want to use
> > when the driver calls dma_alloc_coherent for the device. This is done
> > using the shared-dma-pool DT attribute, which comes in two variants
> > that would be a little too much to explain here.
> >
> > remoteproc makes use of that because apparently the device can only
> > communicate using that region. But it then feeds memory obtained
> > with dma_alloc_coherent back into the virtio code. For that it calls
> > vmalloc_to_page on the dma_alloc_coherent buffer, which is a huge
> > no-go for the DMA API and only worked accidentally on a few
> > platforms, and apparently arm64 just changed a few internals that
> > made it stop working for remoteproc.
> >
> > The right answer is to not use the DMA API to allocate memory from a
> > device-specific region, but to tie the driver directly into the DT
> > reserved memory API in a way that allows it to easily obtain a struct
> > device for it.
>
> If I understand correctly, Peng should be able to reproduce the problem
> on native Linux without any Xen involvement simply by forcing
> vring_use_dma_api to return true. Peng, can you confirm?

It is another issue without Xen involvement; there is a thread
discussing it: https://patchwork.kernel.org/patch/10742923/

Without Xen, vring_use_dma_api will return false. With Xen, if
vring_use_dma_api returns true, it will dma_map_xx and trigger the dump.

Thanks,
Peng.

> And the right fix is not to call vmalloc_to_page on a dma_alloc_coherent
> buffer -- I don't know about the recent changes on arm64, but that's not
> going to work with arm32 either, AFAIK. Given that I don't have a repro,
> I'll leave it to Peng and/or others to send the appropriate patch for
> remoteproc.
>
> > This is orthogonal to another issue, and that is that hardware virtio
> > devices really always need to use the DMA API, otherwise we'll bypass
> > such features as device-specific DMA pools, DMA offsets, cache
> > flushing, etc, etc.
>
> I understand, I'll drop my patch.
On Thu, 24 Jan 2019, Peng Fan wrote:
> Hi Stefano,
>
> > -----Original Message-----
> > From: Stefano Stabellini [mailto:sstabellini@kernel.org]
> > Sent: January 24, 2019 7:44
> > To: hch@infradead.org
> > Cc: Stefano Stabellini <sstabellini@kernel.org>; Peng Fan
> > <peng.fan@nxp.com>; mst@redhat.com; jasowang@redhat.com;
> > xen-devel@lists.xenproject.org; linux-remoteproc@vger.kernel.org;
> > linux-kernel@vger.kernel.org; virtualization@lists.linux-foundation.org;
> > luto@kernel.org; jgross@suse.com; boris.ostrovsky@oracle.com;
> > bjorn.andersson@linaro.org; jliang@xilinx.com
> > Subject: Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
> >
> > On Wed, 23 Jan 2019, hch@infradead.org wrote:
> > > On Wed, Jan 23, 2019 at 01:04:33PM -0800, Stefano Stabellini wrote:
> > > > If vring_use_dma_api is actually supposed to return true when
> > > > dma_dev->dma_mem is set, then both Peng's patch and the patch I
> > > > wrote are not fixing the real issue here.
> > > >
> > > > I don't know enough about remoteproc to know where the problem
> > > > actually lies, though.
> > >
> > > The problem is the following:
> > >
> > > Devices can declare a specific memory region that they want to use
> > > when the driver calls dma_alloc_coherent for the device. This is done
> > > using the shared-dma-pool DT attribute, which comes in two variants
> > > that would be a little too much to explain here.
> > >
> > > remoteproc makes use of that because apparently the device can only
> > > communicate using that region. But it then feeds memory obtained
> > > with dma_alloc_coherent back into the virtio code. For that it calls
> > > vmalloc_to_page on the dma_alloc_coherent buffer, which is a huge
> > > no-go for the DMA API and only worked accidentally on a few
> > > platforms, and apparently arm64 just changed a few internals that
> > > made it stop working for remoteproc.
> > >
> > > The right answer is to not use the DMA API to allocate memory from a
> > > device-specific region, but to tie the driver directly into the DT
> > > reserved memory API in a way that allows it to easily obtain a struct
> > > device for it.
> >
> > If I understand correctly, Peng should be able to reproduce the problem
> > on native Linux without any Xen involvement simply by forcing
> > vring_use_dma_api to return true. Peng, can you confirm?
>
> It is another issue without Xen involvement; there is a thread
> discussing it: https://patchwork.kernel.org/patch/10742923/
>
> Without Xen, vring_use_dma_api will return false. With Xen, if
> vring_use_dma_api returns true, it will dma_map_xx and trigger the dump.

It is true that for Xen on ARM DomUs it is not necessary today to return
true from vring_use_dma_api. However, returning true from
vring_use_dma_api should not break Linux. When the rpmsg issue is fixed,
this problem should also go away without any need for additional changes
on the Xen side, I think.
On Thu, Jan 24, 2019 at 11:14:53AM -0800, Stefano Stabellini wrote:
> On Thu, 24 Jan 2019, Peng Fan wrote:
> > Hi Stefano,
> >
> > > -----Original Message-----
> > > From: Stefano Stabellini [mailto:sstabellini@kernel.org]
> > > Sent: January 24, 2019 7:44
> > > To: hch@infradead.org
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>; Peng Fan
> > > <peng.fan@nxp.com>; mst@redhat.com; jasowang@redhat.com;
> > > xen-devel@lists.xenproject.org; linux-remoteproc@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; virtualization@lists.linux-foundation.org;
> > > luto@kernel.org; jgross@suse.com; boris.ostrovsky@oracle.com;
> > > bjorn.andersson@linaro.org; jliang@xilinx.com
> > > Subject: Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
> > >
> > > On Wed, 23 Jan 2019, hch@infradead.org wrote:
> > > > On Wed, Jan 23, 2019 at 01:04:33PM -0800, Stefano Stabellini wrote:
> > > > > If vring_use_dma_api is actually supposed to return true when
> > > > > dma_dev->dma_mem is set, then both Peng's patch and the patch I
> > > > > wrote are not fixing the real issue here.
> > > > >
> > > > > I don't know enough about remoteproc to know where the problem
> > > > > actually lies, though.
> > > >
> > > > The problem is the following:
> > > >
> > > > Devices can declare a specific memory region that they want to use
> > > > when the driver calls dma_alloc_coherent for the device. This is done
> > > > using the shared-dma-pool DT attribute, which comes in two variants
> > > > that would be a little too much to explain here.
> > > >
> > > > remoteproc makes use of that because apparently the device can only
> > > > communicate using that region. But it then feeds memory obtained
> > > > with dma_alloc_coherent back into the virtio code. For that it calls
> > > > vmalloc_to_page on the dma_alloc_coherent buffer, which is a huge
> > > > no-go for the DMA API and only worked accidentally on a few
> > > > platforms, and apparently arm64 just changed a few internals that
> > > > made it stop working for remoteproc.
> > > >
> > > > The right answer is to not use the DMA API to allocate memory from a
> > > > device-specific region, but to tie the driver directly into the DT
> > > > reserved memory API in a way that allows it to easily obtain a struct
> > > > device for it.
> > >
> > > If I understand correctly, Peng should be able to reproduce the problem
> > > on native Linux without any Xen involvement simply by forcing
> > > vring_use_dma_api to return true. Peng, can you confirm?
> >
> > It is another issue without Xen involvement; there is a thread
> > discussing it: https://patchwork.kernel.org/patch/10742923/
> >
> > Without Xen, vring_use_dma_api will return false. With Xen, if
> > vring_use_dma_api returns true, it will dma_map_xx and trigger the dump.
>
> It is true that for Xen on ARM DomUs it is not necessary today to return
> true from vring_use_dma_api. However, returning true from
> vring_use_dma_api should not break Linux. When the rpmsg issue is fixed,
> this problem should also go away without any need for additional changes
> on the Xen side, I think.

The fewer systems bypass the standard virtio logic (using the feature
bit to figure out whether to bypass the DMA API), the better.
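A hedged sketch of the check mst proposes above (the function name is illustrative; in this era of the kernel the feature spelled "PLATFORM_ACCESS" in the thread is VIRTIO_F_IOMMU_PLATFORM, which is what virtio_has_iommu_quirk() tests for):

```c
#include <linux/virtio_config.h>
#include <xen/xen.h>

/* Sketch: a device that negotiated VIRTIO 1 but not the platform-access/
 * IOMMU feature (i.e. it still has the legacy IOMMU quirk) skips the Xen
 * special case entirely and bypasses the DMA API. */
static bool vring_use_dma_api_sketch(struct virtio_device *vdev)
{
	if (!virtio_has_iommu_quirk(vdev))
		return true;	/* PLATFORM_ACCESS negotiated: DMA API */

	if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
		return false;	/* VIRTIO_1 && !PLATFORM_ACCESS */

	if (xen_domain())	/* legacy (virtio 0) devices keep the hack */
		return true;

	return false;
}
```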
Hi,

> -----Original Message-----
> From: hch@infradead.org [mailto:hch@infradead.org]
> Sent: January 24, 2019 5:14
> To: Stefano Stabellini <sstabellini@kernel.org>
> Cc: hch@infradead.org; Peng Fan <peng.fan@nxp.com>; mst@redhat.com;
> jasowang@redhat.com; xen-devel@lists.xenproject.org;
> linux-remoteproc@vger.kernel.org; linux-kernel@vger.kernel.org;
> virtualization@lists.linux-foundation.org; luto@kernel.org; jgross@suse.com;
> boris.ostrovsky@oracle.com
> Subject: Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
>
> On Wed, Jan 23, 2019 at 01:04:33PM -0800, Stefano Stabellini wrote:
> > If vring_use_dma_api is actually supposed to return true when
> > dma_dev->dma_mem is set, then both Peng's patch and the patch I wrote
> > are not fixing the real issue here.
> >
> > I don't know enough about remoteproc to know where the problem
> > actually lies, though.
>
> The problem is the following:
>
> Devices can declare a specific memory region that they want to use when
> the driver calls dma_alloc_coherent for the device. This is done using
> the shared-dma-pool DT attribute, which comes in two variants that would
> be a little too much to explain here.
>
> remoteproc makes use of that because apparently the device can only
> communicate using that region. But it then feeds memory obtained with
> dma_alloc_coherent back into the virtio code. For that it calls
> vmalloc_to_page on the dma_alloc_coherent buffer, which is a huge no-go
> for the DMA API and only worked accidentally on a few platforms, and
> apparently arm64 just changed a few internals that made it stop working
> for remoteproc.
>
> The right answer is to not use the DMA API to allocate memory from a
> device-specific region, but to tie the driver directly into the DT
> reserved memory API in a way that allows it to easily obtain a struct
> device for it.

Just have a question:

Since vmalloc_to_page is OK for the CMA area, there is no need to take
CMA and per-device CMA into consideration, right?

We only need to implement a piece of code to handle the per-device
specific region using RESERVEDMEM_OF_DECLARE, just like:
RESERVEDMEM_OF_DECLARE(rpmsg-dma, "rpmsg-dma-pool", rmem_rpmsg_dma_setup);
and implement the device_init callback and build a map between page and
phys. Then the scatterlist in the rpmsg driver could use the page
structure, with no need for vmalloc_to_page for per-device DMA.

Is this the right way?

Thanks,
Peng.

> This is orthogonal to another issue, and that is that hardware virtio
> devices really always need to use the DMA API, otherwise we'll bypass
> such features as device-specific DMA pools, DMA offsets, cache
> flushing, etc, etc.
On Fri, 25 Jan 2019, Peng Fan wrote:
> > On Wed, Jan 23, 2019 at 01:04:33PM -0800, Stefano Stabellini wrote:
> > > If vring_use_dma_api is actually supposed to return true when
> > > dma_dev->dma_mem is set, then both Peng's patch and the patch I wrote
> > > are not fixing the real issue here.
> > >
> > > I don't know enough about remoteproc to know where the problem
> > > actually lies, though.
> >
> > The problem is the following:
> >
> > Devices can declare a specific memory region that they want to use when
> > the driver calls dma_alloc_coherent for the device. This is done using
> > the shared-dma-pool DT attribute, which comes in two variants that
> > would be a little too much to explain here.
> >
> > remoteproc makes use of that because apparently the device can only
> > communicate using that region. But it then feeds memory obtained with
> > dma_alloc_coherent back into the virtio code. For that it calls
> > vmalloc_to_page on the dma_alloc_coherent buffer, which is a huge no-go
> > for the DMA API and only worked accidentally on a few platforms, and
> > apparently arm64 just changed a few internals that made it stop working
> > for remoteproc.
> >
> > The right answer is to not use the DMA API to allocate memory from a
> > device-specific region, but to tie the driver directly into the DT
> > reserved memory API in a way that allows it to easily obtain a struct
> > device for it.
>
> Just have a question:
>
> Since vmalloc_to_page is OK for the CMA area, there is no need to take
> CMA and per-device CMA into consideration, right?
>
> We only need to implement a piece of code to handle the per-device
> specific region using RESERVEDMEM_OF_DECLARE, just like:
> RESERVEDMEM_OF_DECLARE(rpmsg-dma, "rpmsg-dma-pool", rmem_rpmsg_dma_setup);
> and implement the device_init callback and build a map between page and
> phys. Then the scatterlist in the rpmsg driver could use the page
> structure, with no need for vmalloc_to_page for per-device DMA.
>
> Is this the right way?

I CC'ed the rpmsg maintainers; you want to keep them in the loop on this.
On Fri, Jan 25, 2019 at 09:45:26AM +0000, Peng Fan wrote:
> Just have a question:
>
> Since vmalloc_to_page is OK for the CMA area, there is no need to take
> CMA and per-device CMA into consideration, right?

The CMA area itself is a physical memory region. If it is a non-highmem
region, you can call virt_to_page on the virtual addresses for it. If it
is in highmem, it doesn't even have a kernel virtual address by default.

> We only need to implement a piece of code to handle the per-device
> specific region using RESERVEDMEM_OF_DECLARE, just like:
> RESERVEDMEM_OF_DECLARE(rpmsg-dma, "rpmsg-dma-pool", rmem_rpmsg_dma_setup);
> and implement the device_init callback and build a map between page and
> phys. Then the scatterlist in the rpmsg driver could use the page
> structure, with no need for vmalloc_to_page for per-device DMA.
>
> Is this the right way?

I think this should work fine. If you have the cycles for it, I'd
actually love to have generic CMA DT glue for non-DMA-API driver
allocations, as there obviously is a need for it. So basically the same
as above, just added to kernel/cma.c as a generic API.
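A minimal sketch of the direction agreed on here, assuming a reserved-memory node with compatible = "rpmsg-dma-pool" as in Peng's example; all rmem_rpmsg_* names are hypothetical, and the bookkeeping is a placeholder for whatever page/phys map the rpmsg driver ends up building:

```c
#include <linux/device.h>
#include <linux/of_reserved_mem.h>

/* Hypothetical handle the rpmsg driver would use to translate between
 * pages and physical addresses itself, instead of vmalloc_to_page(). */
static struct reserved_mem *rpmsg_rmem;

static int rmem_rpmsg_device_init(struct reserved_mem *rmem,
				  struct device *dev)
{
	/* rmem->base / rmem->size describe the carveout. */
	rpmsg_rmem = rmem;
	return 0;
}

static const struct reserved_mem_ops rmem_rpmsg_ops = {
	.device_init	= rmem_rpmsg_device_init,
};

static int __init rmem_rpmsg_dma_setup(struct reserved_mem *rmem)
{
	rmem->ops = &rmem_rpmsg_ops;
	pr_info("rpmsg: reserved region %pa, size %pa\n",
		&rmem->base, &rmem->size);
	return 0;
}
/* First argument must be a valid C identifier, hence rpmsg_dma rather
 * than the rpmsg-dma spelling used in the mail above. */
RESERVEDMEM_OF_DECLARE(rpmsg_dma, "rpmsg-dma-pool", rmem_rpmsg_dma_setup);
```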
> -----Original Message-----
> From: hch@infradead.org [mailto:hch@infradead.org]
> Sent: January 28, 2019 16:00
> To: Peng Fan <peng.fan@nxp.com>
> Cc: hch@infradead.org; Stefano Stabellini <sstabellini@kernel.org>;
> mst@redhat.com; jasowang@redhat.com; xen-devel@lists.xenproject.org;
> linux-remoteproc@vger.kernel.org; linux-kernel@vger.kernel.org;
> virtualization@lists.linux-foundation.org; luto@kernel.org; jgross@suse.com;
> boris.ostrovsky@oracle.com; Andy Duan <fugang.duan@nxp.com>
> Subject: Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
>
> On Fri, Jan 25, 2019 at 09:45:26AM +0000, Peng Fan wrote:
> > Just have a question:
> >
> > Since vmalloc_to_page is OK for the CMA area, there is no need to take
> > CMA and per-device CMA into consideration, right?
>
> The CMA area itself is a physical memory region. If it is a non-highmem
> region, you can call virt_to_page on the virtual addresses for it. If it
> is in highmem, it doesn't even have a kernel virtual address by default.
>
> > We only need to implement a piece of code to handle the per-device
> > specific region using RESERVEDMEM_OF_DECLARE, just like:
> > RESERVEDMEM_OF_DECLARE(rpmsg-dma, "rpmsg-dma-pool",
> > rmem_rpmsg_dma_setup); and implement the device_init callback and
> > build a map between page and phys. Then the scatterlist in the rpmsg
> > driver could use the page structure, with no need for vmalloc_to_page
> > for per-device DMA.
> >
> > Is this the right way?
>
> I think this should work fine. If you have the cycles for it, I'd
> actually love to have generic CMA DT glue for non-DMA-API driver
> allocations, as there obviously is a need for it. So basically the same
> as above, just added to kernel/cma.c as a generic API.

Thanks for the hints. I'll try to add that.

Thanks,
Peng.
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cd7e755484e3..8993d7cb3592 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -248,6 +248,8 @@ static inline bool virtqueue_use_indirect(struct virtqueue *_vq,
 
 static bool vring_use_dma_api(struct virtio_device *vdev)
 {
+	struct device *dma_dev = vdev->dev.parent;
+
 	if (!virtio_has_iommu_quirk(vdev))
 		return true;
 
@@ -260,7 +262,7 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
 	 * the DMA API if we're a Xen guest, which at least allows
 	 * all of the sensible Xen configurations to work correctly.
 	 */
-	if (xen_domain())
+	if (xen_domain() && !dma_dev->dma_mem)
 		return true;
 
 	return false;
On i.MX8QM, M4_1 is communicating with DomU using rpmsg, with a fixed,
predefined address as the DMA memory buffer.

Without this patch, the flow is:
vring_map_one_sg -> vring_use_dma_api
 -> dma_map_page
  -> __swiotlb_map_page
   -> swiotlb_map_page
    -> __dma_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
However, we are using a per-device DMA area for rpmsg, and phys_to_virt
cannot return a correct virtual address for an address in the vmalloc
area. Then the kernel panics.

With this patch, vring_use_dma_api will return false, and
vring_map_one_sg will return sg_phys(sg), which is the correct physical
address in the predefined memory region:
vring_map_one_sg -> vring_use_dma_api
 -> sg_phys(sg)

Signed-off-by: Peng Fan <peng.fan@nxp.com>
---
 drivers/virtio/virtio_ring.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)