
swiotlb: Max mapping size takes min align mask into account

Message ID 20220510142109.777738-1-ltykernel@gmail.com (mailing list archive)
State Not Applicable
Series swiotlb: Max mapping size takes min align mask into account

Commit Message

Tianyu Lan May 10, 2022, 2:21 p.m. UTC
From: Tianyu Lan <Tianyu.Lan@microsoft.com>

swiotlb_find_slots() skips slots according to the IO TLB alignment
mask calculated from the device's min align mask and the original
physical address offset. This limits the maximum mapping size: when
the original offset is non-zero, a mapping cannot reach the full
IO_TLB_SEGSIZE * IO_TLB_SIZE. This causes a boot failure in Hyper-V
Isolation VMs, where swiotlb is forced on. The SCSI layer sets its
max segment size from the return value of dma_max_mapping_size(),
which ultimately calls swiotlb_max_mapping_size(). The Hyper-V
storage driver sets the min align mask to 4k - 1, so the SCSI layer
may pass a 256k request buffer at a 0~4k offset, for which
swiotlb_find_slots() cannot find a 256k bounce buffer, and the
driver fails to get a bounce buffer via the DMA API. Make
swiotlb_max_mapping_size() take the min align mask into account.
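
For example, with the in-tree constants (IO_TLB_SHIFT = 11, i.e. 2k
bounce slots, and IO_TLB_SEGSIZE = 128 slots per segment) and the
Hyper-V mask of 4k - 1, the limit drops from 256k to 252k. A minimal
userspace sketch of the arithmetic, with roundup() mirroring the
kernel macro:

	#include <stdio.h>

	#define IO_TLB_SHIFT   11			/* 2k bounce slots */
	#define IO_TLB_SIZE    (1UL << IO_TLB_SHIFT)
	#define IO_TLB_SEGSIZE 128			/* slots per segment */

	/* same result as the kernel's roundup() for these values */
	#define roundup(x, y)  ((((x) + (y) - 1) / (y)) * (y))

	int main(void)
	{
		unsigned long max = IO_TLB_SIZE * IO_TLB_SEGSIZE;
		unsigned long mask = 4096 - 1;	/* storvsc's 4k - 1 */

		printf("old limit: %lu\n", max);	/* 262144 (256k) */
		printf("new limit: %lu\n",
		       max - roundup(mask, IO_TLB_SIZE));	/* 258048 (252k) */
		return 0;
	}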

Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
---
 kernel/dma/swiotlb.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Comments

Robin Murphy May 10, 2022, 4:33 p.m. UTC | #1
On 2022-05-10 15:21, Tianyu Lan wrote:
> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
> 
> swiotlb_find_slots() skips slots according to the IO TLB alignment
> mask calculated from the device's min align mask and the original
> physical address offset. This limits the maximum mapping size: when
> the original offset is non-zero, a mapping cannot reach the full
> IO_TLB_SEGSIZE * IO_TLB_SIZE. This causes a boot failure in Hyper-V
> Isolation VMs, where swiotlb is forced on. The SCSI layer sets its
> max segment size from the return value of dma_max_mapping_size(),
> which ultimately calls swiotlb_max_mapping_size(). The Hyper-V
> storage driver sets the min align mask to 4k - 1, so the SCSI layer
> may pass a 256k request buffer at a 0~4k offset, for which
> swiotlb_find_slots() cannot find a 256k bounce buffer, and the
> driver fails to get a bounce buffer via the DMA API. Make
> swiotlb_max_mapping_size() take the min align mask into account.

Hmm, this seems a bit pessimistic - the offset can vary per mapping, so 
it feels to me like it should really be the caller's responsibility to 
account for it if they're already involved enough to care about both 
constraints. But I'm not sure how practical that would be.
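
E.g., in principle each mapper could trim by its own per-mapping
offset instead of every request paying the worst case up front - an
untested sketch, where usable_size() is a hypothetical helper rather
than anything that exists today:

	#include <linux/dma-mapping.h>

	/* untested: shrink this mapping's limit by its own sub-page
	 * offset, instead of swiotlb assuming the worst-case offset
	 * for every device with a min align mask set */
	static size_t usable_size(struct device *dev, phys_addr_t paddr)
	{
		unsigned int mask = dma_get_min_align_mask(dev);

		return dma_max_mapping_size(dev) - (paddr & mask);
	}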

Robin.

> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> ---
>   kernel/dma/swiotlb.c | 13 ++++++++++++-
>   1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 73a41cec9e38..0d6684ca7eab 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -743,7 +743,18 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
>   
>   size_t swiotlb_max_mapping_size(struct device *dev)
>   {
> -	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE;
> +	int min_align_mask = dma_get_min_align_mask(dev);
> +	int min_align = 0;
> +
> +	/*
> +	 * swiotlb_find_slots() skips slots according to
> +	 * min align mask. This affects max mapping size.
> +	 * Take it into account here.
> +	 */
> +	if (min_align_mask)
> +		min_align = roundup(min_align_mask, IO_TLB_SIZE);
> +
> +	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE - min_align;
>   }
>   
>   bool is_swiotlb_active(struct device *dev)
Michael Kelley (LINUX) May 10, 2022, 6:26 p.m. UTC | #2
From: Robin Murphy <robin.murphy@arm.com> Sent: Tuesday, May 10, 2022 9:34 AM
> 
> On 2022-05-10 15:21, Tianyu Lan wrote:
> > From: Tianyu Lan <Tianyu.Lan@microsoft.com>
> >
> > swiotlb_find_slots() skips slots according to the IO TLB alignment
> > mask calculated from the device's min align mask and the original
> > physical address offset. This limits the maximum mapping size: when
> > the original offset is non-zero, a mapping cannot reach the full
> > IO_TLB_SEGSIZE * IO_TLB_SIZE. This causes a boot failure in Hyper-V
> > Isolation VMs, where swiotlb is forced on. The SCSI layer sets its
> > max segment size from the return value of dma_max_mapping_size(),
> > which ultimately calls swiotlb_max_mapping_size(). The Hyper-V
> > storage driver sets the min align mask to 4k - 1, so the SCSI layer
> > may pass a 256k request buffer at a 0~4k offset, for which
> > swiotlb_find_slots() cannot find a 256k bounce buffer, and the
> > driver fails to get a bounce buffer via the DMA API. Make
> > swiotlb_max_mapping_size() take the min align mask into account.
> 
> Hmm, this seems a bit pessimistic - the offset can vary per mapping, so
> it feels to me like it should really be the caller's responsibility to
> account for it if they're already involved enough to care about both
> constraints. But I'm not sure how practical that would be.

Tianyu and I discussed this prior to his submitting the patch.
Presumably dma_max_mapping_size() exists so that the higher
level blk-mq code can limit the size of I/O requests to something
that will "fit" in the swiotlb when bounce buffering is enabled.
Unfortunately, the current code is just giving the wrong answer
when the offset is non-zero.  The offset would be less than
PAGE_SIZE, so the impact would be dma_max_mapping_size()
returning 252 Kbytes instead of 256 Kbytes, but only for devices
where dma min align mask is set.  And any I/O sizes less than
252 Kbytes are unaffected even when dma min align mask is set. 
Net, the impact would be only in a fairly rare edge case.

Even on ARM64 with a 64K page size, the Hyper-V storage driver
is setting the dma min align mask to only 4K (which is correct because
the Hyper-V host uses a 4K page size even if the guest is using
something larger), so again the limit becomes 252 Kbytes instead
of 256 Kbytes, and any impact is rare.
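
(For reference, storvsc sets that mask at probe time with what is,
as best I recall, dma_set_min_align_mask(&device->device,
HV_HYP_PAGE_SIZE - 1), where HV_HYP_PAGE_SIZE is the fixed 4k
hypervisor page size.)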

As you mentioned, how else would a caller handle this situation?

Michael

> 
> Robin.
> 
> > Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> > ---
> >   kernel/dma/swiotlb.c | 13 ++++++++++++-
> >   1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> > index 73a41cec9e38..0d6684ca7eab 100644
> > --- a/kernel/dma/swiotlb.c
> > +++ b/kernel/dma/swiotlb.c
> > @@ -743,7 +743,18 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
> >
> >   size_t swiotlb_max_mapping_size(struct device *dev)
> >   {
> > -	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE;
> > +	int min_align_mask = dma_get_min_align_mask(dev);
> > +	int min_align = 0;
> > +
> > +	/*
> > +	 * swiotlb_find_slots() skips slots according to
> > +	 * min align mask. This affects max mapping size.
> > +	 * Take it into account here.
> > +	 */
> > +	if (min_align_mask)
> > +		min_align = roundup(min_align_mask, IO_TLB_SIZE);
> > +
> > +	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE - min_align;
> >   }
> >
> >   bool is_swiotlb_active(struct device *dev)
Christoph Hellwig May 11, 2022, 6:02 a.m. UTC | #3
On Tue, May 10, 2022 at 06:26:55PM +0000, Michael Kelley (LINUX) wrote:
> > Hmm, this seems a bit pessimistic - the offset can vary per mapping, so
> > it feels to me like it should really be the caller's responsibility to
> > account for it if they're already involved enough to care about both
> > constraints. But I'm not sure how practical that would be.
> 
> Tianyu and I discussed this prior to his submitting the patch.
> Presumably dma_max_mapping_size() exists so that the higher
> level blk-mq code can limit the size of I/O requests to something
> that will "fit" in the swiotlb when bounce buffering is enabled.

Yes, the whole point of dma_max_mapping_size() was that upper-level
code shouldn't need to care.

> As you mentioned, how else would a caller handle this situation?

Well, we could look at dma_get_min_align_mask in the caller and do
the calculation there, but I really don't think that is a good idea.

So this patch looks sensible to me.
Christoph Hellwig May 17, 2022, 9:22 a.m. UTC | #4
Thanks,

applied to the dma-mapping for-next tree.

Patch

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 73a41cec9e38..0d6684ca7eab 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -743,7 +743,18 @@  dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
 
 size_t swiotlb_max_mapping_size(struct device *dev)
 {
-	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE;
+	int min_align_mask = dma_get_min_align_mask(dev);
+	int min_align = 0;
+
+	/*
+	 * swiotlb_find_slots() skips slots according to
+	 * min align mask. This affects max mapping size.
+	 * Take it into account here.
+	 */
+	if (min_align_mask)
+		min_align = roundup(min_align_mask, IO_TLB_SIZE);
+
+	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE - min_align;
 }
 
 bool is_swiotlb_active(struct device *dev)
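
For context, the slot skipping that motivates the subtraction sits in
swiotlb_find_slots(); roughly (simplified from the swiotlb.c of that
era, with locking, boundary checks and allocation bookkeeping
omitted):

	unsigned int iotlb_align_mask =
		dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1);
	...
	do {
		/*
		 * Skip any slot whose low address bits cannot match
		 * the original buffer's offset under the device's
		 * min align mask.
		 */
		if (orig_addr &&
		    (slot_addr(tbl_dma_addr, index) & iotlb_align_mask) !=
				(orig_addr & iotlb_align_mask)) {
			index = wrap_index(mem, index + 1);
			continue;
		}
		...
	} while (index != wrap);

With a 4k - 1 mask, up to roundup(mask, IO_TLB_SIZE) bytes of a
segment can be lost to this skipping, which is exactly what the
patched swiotlb_max_mapping_size() now subtracts.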