[v2,09/20] xen/biomerge: Don't allow biovec to be merged when Linux is not using 4KB pages

Message ID 1436474552-31789-10-git-send-email-julien.grall@citrix.com (mailing list archive)
State New, archived

Commit Message

Julien Grall July 9, 2015, 8:42 p.m. UTC
When Linux is using 64K page granularity, every page will be split into
multiple non-contiguous 4K MFNs (the page granularity of Xen).

I'm not sure how to efficiently handle the check of whether we can
merge 2 biovecs in such a case. So for now, always say that biovecs are
not mergeable.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
    Changes in v2:
        - Remove the workaround and check whether the Linux page granularity
        is the same as Xen's or not
---
 drivers/xen/biomerge.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Konrad Rzeszutek Wilk July 10, 2015, 7:12 p.m. UTC | #1
On Thu, Jul 09, 2015 at 09:42:21PM +0100, Julien Grall wrote:
> When Linux is using 64K page granularity, every page will be split into
> multiple non-contiguous 4K MFNs (the page granularity of Xen).

But you don't care about that at the Linux layer, I think?

As in, is there an SWIOTLB that does PFN to MFN and vice-versa
translation?

I thought that ARM guests are not exposed to the MFN<->PFN logic
and trying to figure that out to not screw up the DMA engine
on a PCIe device slurping up contiguous MFNs which don't map
to contiguous PFNs?

> 
> I'm not sure how to efficiently handle the check of whether we can
> merge 2 biovecs in such a case. So for now, always say that biovecs are
> not mergeable.
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> ---
>     Changes in v2:
>         - Remove the workaround and check whether the Linux page granularity
>         is the same as Xen's or not
> ---
>  drivers/xen/biomerge.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c
> index 0edb91c..571567c 100644
> --- a/drivers/xen/biomerge.c
> +++ b/drivers/xen/biomerge.c
> @@ -6,10 +6,17 @@
>  bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
>  			       const struct bio_vec *vec2)
>  {
> +#if XEN_PAGE_SIZE == PAGE_SIZE
>  	unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page));
>  	unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page));
>  
>  	return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) &&
>  		((mfn1 == mfn2) || ((mfn1+1) == mfn2));
> +#else
> +	/* XXX: bio_vec are not mergeable when using different page sizes in
> +	 * Xen and Linux.
> +	 */
> +	return 0;
> +#endif
>  }
>  EXPORT_SYMBOL(xen_biovec_phys_mergeable);
> -- 
> 2.1.4
>
Julien Grall July 15, 2015, 8:56 a.m. UTC | #2
Hi Konrad,

On 10/07/2015 21:12, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 09, 2015 at 09:42:21PM +0100, Julien Grall wrote:
>> When Linux is using 64K page granularity, every page will be split into
>> multiple non-contiguous 4K MFNs (the page granularity of Xen).
>
> But you don't care about that on the Linux layer I think?

In the general (i.e. arch-agnostic) use case we care about it. We don't
want to merge 2 biovecs if they are not living on contiguous MFNs.
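
The tricky part is that a correct check would have to walk every 4KB
Xen frame covered by the biovecs. An untested sketch, where
__xen_pfn_to_mfn() is an assumed helper translating at Xen's 4KB
granularity:

/* Check that nr_frames Xen frames starting at xen_pfn map to
 * consecutive MFNs, i.e. the range is machine-contiguous. With 64KB
 * Linux pages even a single biovec has no such guarantee, so this
 * walk would be needed on top of __BIOVEC_PHYS_MERGEABLE. */
static bool xen_frames_contiguous(unsigned long xen_pfn,
				  unsigned int nr_frames)
{
	unsigned long mfn = __xen_pfn_to_mfn(xen_pfn);
	unsigned int i;

	for (i = 1; i < nr_frames; i++)
		if (__xen_pfn_to_mfn(xen_pfn + i) != mfn + i)
			return false;
	return true;
}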

> As in, is there an SWIOTLB that does PFN to MFN and vice-versa
> translation?
>
> I thought that ARM guests are not exposed to the MFN<->PFN logic
> and trying to figure that out to not screw up the DMA engine
> on a PCIe device slurping up contiguous MFNs which don't map
> to contiguous PFNs?


I will leave these 2 questions to Stefano. He knows the swiotlb for ARM
better than me.

So far, I have skipped the swiotlb implementation for 64KB page
granularity, as I'm not sure what to do when a page is split across
multiple MFNs.

Although I don't think this can happen with this specific series, as:
	- The memory is direct mapped, so any Linux page uses contiguous
MFNs.
	- A foreign mapping only uses 4KB of the Linux page. This is for an
easier implementation.

For the latter, I plan to work on using a single Linux page to map
multiple foreign gfns. I have to talk with Stefano about how to handle
it.

Regards,
Stefano Stabellini July 16, 2015, 3:33 p.m. UTC | #3
On Fri, 10 Jul 2015, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 09, 2015 at 09:42:21PM +0100, Julien Grall wrote:
> > When Linux is using 64K page granularity, every page will be split into
> > multiple non-contiguous 4K MFNs (the page granularity of Xen).
> 
> But you don't care about that on the Linux layer I think?
> 
> As in, is there an SWIOTLB that does PFN to MFN and vice-versa
> translation?
> 
> I thought that ARM guests are not exposed to the MFN<->PFN logic
> and trying to figure that out to not screw up the DMA engine
> on a PCIe device slurping up contiguous MFNs which don't map
> to contiguous PFNs?

Dom0 is mapped 1:1, so pfn == mfn normally, however grant maps
unavoidably screw up the 1:1, so the swiotlb jumps in to save the day
when a foreign granted page is involved in a dma operation.

Regarding xen_biovec_phys_mergeable, we could check that all the pfn ==
mfn and return true in that case.
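
Something along these lines (untested; page_to_xen_pfn() is an assumed
helper returning the first 4KB Xen frame backing a Linux page, and
__pfn_to_mfn() is assumed to translate at that granularity):

/* Only grant maps break dom0's 1:1 mapping. If every Xen frame
 * backing the page is identity mapped (pfn == mfn), machine addresses
 * are contiguous exactly when physical addresses are, so the generic
 * __BIOVEC_PHYS_MERGEABLE check is sufficient on its own. */
static bool page_identity_mapped(struct page *page)
{
	unsigned long xen_pfn = page_to_xen_pfn(page);
	unsigned int i;

	for (i = 0; i < PAGE_SIZE / XEN_PAGE_SIZE; i++)
		if (__pfn_to_mfn(xen_pfn + i) != xen_pfn + i)
			return false;
	return true;
}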


> > I'm not sure how to efficiently handle the check of whether we can
> > merge 2 biovecs in such a case. So for now, always say that biovecs are
> > not mergeable.
> > 
> > Signed-off-by: Julien Grall <julien.grall@citrix.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> > Cc: David Vrabel <david.vrabel@citrix.com>
> > ---
> >     Changes in v2:
> >         - Remove the workaround and check whether the Linux page granularity
> >         is the same as Xen's or not
> > ---
> >  drivers/xen/biomerge.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c
> > index 0edb91c..571567c 100644
> > --- a/drivers/xen/biomerge.c
> > +++ b/drivers/xen/biomerge.c
> > @@ -6,10 +6,17 @@
> >  bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
> >  			       const struct bio_vec *vec2)
> >  {
> > +#if XEN_PAGE_SIZE == PAGE_SIZE
> >  	unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page));
> >  	unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page));
> >  
> >  	return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) &&
> >  		((mfn1 == mfn2) || ((mfn1+1) == mfn2));
> > +#else
> > +	/* XXX: bio_vec are not mergeable when using different page sizes in
> > +	 * Xen and Linux.
> > +	 */
> > +	return 0;
> > +#endif
> >  }
> >  EXPORT_SYMBOL(xen_biovec_phys_mergeable);
> > -- 
> > 2.1.4
> > 
>
Julien Grall July 16, 2015, 4:15 p.m. UTC | #4
Hi Stefano,

On 16/07/2015 16:33, Stefano Stabellini wrote:
> On Fri, 10 Jul 2015, Konrad Rzeszutek Wilk wrote:
>> On Thu, Jul 09, 2015 at 09:42:21PM +0100, Julien Grall wrote:
>>> When Linux is using 64K page granularity, every page will be split into
>>> multiple non-contiguous 4K MFNs (the page granularity of Xen).
>>
>> But you don't care about that on the Linux layer I think?
>>
>> As in, is there an SWIOTLB that does PFN to MFN and vice-versa
>> translation?
>>
>> I thought that ARM guests are not exposed to the MFN<->PFN logic
>> and trying to figure that out to not screw up the DMA engine
>> on a PCIe device slurping up contiguous MFNs which don't map
>> to contiguous PFNs?
>
> Dom0 is mapped 1:1, so pfn == mfn normally, however grant maps
> unavoidably screw up the 1:1, so the swiotlb jumps in to save the day
> when a foreign granted page is involved in a dma operation.
>
> Regarding xen_biovec_phys_mergeable, we could check that all the pfn ==
> mfn and return true in that case.

I mentioned it in the commit message. Although, we would have to loop
over every pfn, which is slow with 64KB pages (16 lookups for every
page). Given that the biovec check is called often, I don't think we can
do such a thing.

Regards,
Konrad Rzeszutek Wilk July 16, 2015, 6:30 p.m. UTC | #5
On Thu, Jul 16, 2015 at 05:15:41PM +0100, Julien Grall wrote:
> Hi Stefano,
> 
> On 16/07/2015 16:33, Stefano Stabellini wrote:
> >On Fri, 10 Jul 2015, Konrad Rzeszutek Wilk wrote:
> >>On Thu, Jul 09, 2015 at 09:42:21PM +0100, Julien Grall wrote:
> >>>When Linux is using 64K page granularity, every page will be split into
> >>>multiple non-contiguous 4K MFNs (the page granularity of Xen).
> >>
> >>But you don't care about that on the Linux layer I think?
> >>
> >>As in, is there an SWIOTLB that does PFN to MFN and vice-versa
> >>translation?
> >>
> >>I thought that ARM guests are not exposed to the MFN<->PFN logic
> >>and trying to figure that out to not screw up the DMA engine
> >>on a PCIe device slurping up contiguous MFNs which don't map
> >>to contiguous PFNs?
> >
> >Dom0 is mapped 1:1, so pfn == mfn normally, however grant maps
> >unavoidably screw up the 1:1, so the swiotlb jumps in to save the day
> >when a foreign granted page is involved in a dma operation.
> >
> >Regarding xen_biovec_phys_mergeable, we could check that all the pfn ==
> >mfn and return true in that case.
> 
> I mentioned it in the commit message. Although, we would have to loop over
> every pfn, which is slow with 64KB pages (16 lookups for every page). Given
> that the biovec check is called often, I don't think we can do such a thing.

OK - it would be good to have the gist of this email thread in the
commit message. Thanks.
> 
> Regards,
> 
> -- 
> Julien Grall
Stefano Stabellini July 17, 2015, 1:20 p.m. UTC | #6
On Thu, 16 Jul 2015, Julien Grall wrote:
> Hi Stefano,
> 
> On 16/07/2015 16:33, Stefano Stabellini wrote:
> > On Fri, 10 Jul 2015, Konrad Rzeszutek Wilk wrote:
> > > On Thu, Jul 09, 2015 at 09:42:21PM +0100, Julien Grall wrote:
> > > > When Linux is using 64K page granularity, every page will be split into
> > > > multiple non-contiguous 4K MFNs (the page granularity of Xen).
> > > 
> > > But you don't care about that on the Linux layer I think?
> > > 
> > > As in, is there an SWIOTLB that does PFN to MFN and vice-versa
> > > translation?
> > > 
> > > I thought that ARM guests are not exposed to the MFN<->PFN logic
> > > and trying to figure that out to not screw up the DMA engine
> > > on a PCIe device slurping up contiguous MFNs which don't map
> > > to contiguous PFNs?
> > 
> > Dom0 is mapped 1:1, so pfn == mfn normally, however grant maps
> > unavoidably screw up the 1:1, so the swiotlb jumps in to save the day
> > when a foreign granted page is involved in a dma operation.
> > 
> > Regarding xen_biovec_phys_mergeable, we could check that all the pfn ==
> > mfn and return true in that case.
> 
> I mentioned it in the commit message. Although, we would have to loop over
> every pfn, which is slow with 64KB pages (16 lookups for every page). Given
> that the biovec check is called often, I don't think we can do such a thing.

We would have to run some benchmarks, but I think it would still be a
win. We should write an ad-hoc __pfn_to_mfn translation function that
operates on a range of pfns and simply checks whether an entry is
present in that range. It should be just as fast as __pfn_to_mfn. I
would definitely recommend it.
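
An untested sketch of what I have in mind, modelled on
arch/arm/xen/p2m.c (the lock, tree, and entry field names below are
assumptions based on that file; the ARM p2m only tracks grant-mapped
ranges, everything else is 1:1):

/* Return true if no grant-mapped entry overlaps [pfn, pfn + nr_pfn),
 * i.e. the whole range is identity mapped. One rbtree descent instead
 * of nr_pfn separate __pfn_to_mfn() lookups. */
bool __pfn_range_is_identity(unsigned long pfn, unsigned long nr_pfn)
{
	struct rb_node *n;
	unsigned long irqflags;
	bool identity = true;

	read_lock_irqsave(&p2m_lock, irqflags);
	n = phys_to_mach.rb_node;
	while (n) {
		struct xen_p2m_entry *entry =
			rb_entry(n, struct xen_p2m_entry, rbnode_phys);

		if (entry->pfn + entry->nr_pages <= pfn)
			n = n->rb_right;	/* entry entirely below range */
		else if (entry->pfn >= pfn + nr_pfn)
			n = n->rb_left;		/* entry entirely above range */
		else {
			identity = false;	/* overlapping grant map */
			break;
		}
	}
	read_unlock_irqrestore(&p2m_lock, irqflags);

	return identity;
}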
Julien Grall July 17, 2015, 2:44 p.m. UTC | #7
On 17/07/15 14:20, Stefano Stabellini wrote:
> We would have to run some benchmarks, but I think it would still be a
> win. We should write an ad-hoc __pfn_to_mfn translation function that
> operates on a range of pfns and simply checks whether an entry is
> present in that range. It should be just as fast as __pfn_to_mfn. I
> would definitely recommend it.

I'd like to see basic 64KB page support on Xen pushed into upstream
Linux before looking at possible improvements to the code. Can we defer
this as a follow-up to this series?

Regards,
Stefano Stabellini July 17, 2015, 2:45 p.m. UTC | #8
On Fri, 17 Jul 2015, Julien Grall wrote:
> On 17/07/15 14:20, Stefano Stabellini wrote:
> > We would have to run some benchmarks, but I think it would still be a
> > win. We should write an ad-hoc __pfn_to_mfn translation function that
> > operates on a range of pfns and simply checks whether an entry is
> > present in that range. It should be just as fast as __pfn_to_mfn. I
> > would definitely recommend it.
> 
> I'd like to see basic 64KB page support on Xen pushed into upstream
> Linux before looking at possible improvements to the code. Can we
> defer this as a follow-up to this series?

Yes, maybe add a TODO comment in the code.
Julien Grall July 17, 2015, 2:46 p.m. UTC | #9
On 17/07/15 15:45, Stefano Stabellini wrote:
> On Fri, 17 Jul 2015, Julien Grall wrote:
>> On 17/07/15 14:20, Stefano Stabellini wrote:
>>> We would have to run some benchmarks, but I think it would still be a
>>> win. We should write an ad-hoc __pfn_to_mfn translation function that
>>> operates on a range of pfns and simply checks whether an entry is
>>> present in that range. It should be just as fast as __pfn_to_mfn. I
>>> would definitely recommend it.
>>
>> I'd like to see basic 64KB page support on Xen pushed into upstream
>> Linux before looking at possible improvements to the code. Can we
>> defer this as a follow-up to this series?
> 
> Yes, maybe add a TODO comment in the code. 

Will do.
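
Something like this, maybe (exact wording to be decided):

	/* XXX: bio_vec are not mergeable when using different page
	 * sizes in Xen and Linux.
	 * TODO: allow merging when the whole range is identity mapped
	 * (pfn == mfn), once the cost of the check is acceptable.
	 */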

Regards,

Patch

diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c
index 0edb91c..571567c 100644
--- a/drivers/xen/biomerge.c
+++ b/drivers/xen/biomerge.c
@@ -6,10 +6,17 @@ 
 bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
 			       const struct bio_vec *vec2)
 {
+#if XEN_PAGE_SIZE == PAGE_SIZE
 	unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page));
 	unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page));
 
 	return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) &&
 		((mfn1 == mfn2) || ((mfn1+1) == mfn2));
+#else
+	/* XXX: bio_vec are not mergeable when using different page sizes in
+	 * Xen and Linux.
+	 */
+	return 0;
+#endif
 }
 EXPORT_SYMBOL(xen_biovec_phys_mergeable);