diff mbox series

lib/scatterlist: Fix to merge contiguous pages into the last SG properly

Message ID 20230105112339.107969-1-yishaih@nvidia.com (mailing list archive)
State New, archived

Commit Message

Yishai Hadas Jan. 5, 2023, 11:23 a.m. UTC
When sg_alloc_append_table_from_pages() calls pages_are_mergeable() in
its 'sgt_append->prv' flow to check whether it can merge contiguous
pages into the last SG, it passes the page arguments in the wrong order.

The first parameter should be the next candidate page to be merged, and
the second the last page already in the SG, not the other way around.

The current code can wrongly merge non-contiguous pages, which corrupts
the SG and has resulted in Oopses and unexpected errors.

Fix this by passing the page parameters in the right order.

Fixes: 1567b49d1a40 ("lib/scatterlist: add check when merging zone device pages")
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
---
 lib/scatterlist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
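
To illustrate why the argument order matters, here is a minimal userspace
sketch. It assumes, as the commit message implies, that the predicate
tests whether its first argument is the page frame directly after its
second. The names model_page, model_pages_are_mergeable(),
check_fixed_order() and check_swapped_order() are hypothetical and are
not kernel APIs; the real helper also checks pgmap compatibility, which
is omitted here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the page frame number. */
struct model_page {
	unsigned long pfn;
};

/*
 * Model of the ordering contract: 'a' is the candidate page, 'b' is the
 * last page already in the SG. They merge only if 'a' is the frame
 * directly after 'b'. (Assumption: the pgmap check is left out.)
 */
static bool model_pages_are_mergeable(const struct model_page *a,
				      const struct model_page *b)
{
	return a->pfn == b->pfn + 1;
}

/* Ask "does next follow last?" with the candidate passed first. */
static bool check_fixed_order(unsigned long last_pfn, unsigned long next_pfn)
{
	struct model_page last = { .pfn = last_pfn };
	struct model_page next = { .pfn = next_pfn };

	return model_pages_are_mergeable(&next, &last);
}

/* Same question with the arguments swapped, as at the buggy call site. */
static bool check_swapped_order(unsigned long last_pfn, unsigned long next_pfn)
{
	struct model_page last = { .pfn = last_pfn };
	struct model_page next = { .pfn = next_pfn };

	return model_pages_are_mergeable(&last, &next);
}
```

In this model, with the last page at frame 10, the swapped call rejects
the genuinely contiguous frame 11 and accepts frame 9, which is the kind
of wrong merge the commit message describes.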

Comments

Jason Gunthorpe Jan. 5, 2023, 1:36 p.m. UTC | #1
On Thu, Jan 05, 2023 at 01:23:39PM +0200, Yishai Hadas wrote:
> When sg_alloc_append_table_from_pages() calls to pages_are_mergeable()
> in its 'sgt_append->prv' flow to check whether it can merge contiguous
> pages into the last SG, it passes the page arguments in the wrong order.
> 
> The first parameter should be the next candidate page to be merged to
> the last page and not the opposite.
> 
> The current code leads to a corrupted SG which resulted in OOPs and
> unexpected errors when non-contiguous pages are merged wrongly.
> 
> Fix to pass the page parameters in the right order.
> 
> Fixes: 1567b49d1a40 ("lib/scatterlist: add check when merging zone device pages")
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> ---
>  lib/scatterlist.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Also, I'm looking more closely at 1567b49d1a40 and this is not right either:

-               unsigned long paddr =
-                       (page_to_pfn(sg_page(sgt_append->prv)) * PAGE_SIZE +
-                        sgt_append->prv->offset + sgt_append->prv->length) /
-                       PAGE_SIZE;
-
-               while (n_pages && page_to_pfn(pages[0]) == paddr) {
+               last_pg = sg_page(sgt_append->prv);
+               while (n_pages && pages_are_mergeable(last_pg, pages[0])) {

This change will break things like multi-page combining, sub page
scenarios and maybe more.

The contiguity test here has to be done on the physical address; it
should go back to struct page to check if the pgmap is OK.

Can you fix it as well?

Thanks,
Jason
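
The concern above can be sketched with a small userspace model. It
reproduces the arithmetic of the removed 'paddr' lines and compares it
with a test against the SG's first page only. The names model_sg,
model_next_pfn(), contiguous_by_phys() and contiguous_by_first_page(),
and the 4096-byte MODEL_PAGE_SIZE, are assumptions for illustration and
not kernel code.

```c
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical model of an SG entry: first page frame, offset, length. */
struct model_sg {
	unsigned long first_pfn;
	unsigned long offset;	/* bytes into the first page */
	unsigned long length;	/* bytes covered by the entry */
};

/* Frame that must come next for a physically contiguous append. */
static unsigned long model_next_pfn(const struct model_sg *sg)
{
	return (sg->first_pfn * MODEL_PAGE_SIZE + sg->offset + sg->length) /
	       MODEL_PAGE_SIZE;
}

/* Contiguity test on the physical end of the entry (the removed logic). */
static bool contiguous_by_phys(const struct model_sg *sg, unsigned long pfn)
{
	return pfn == model_next_pfn(sg);
}

/* Test against the entry's first page only, ignoring offset and length. */
static bool contiguous_by_first_page(const struct model_sg *sg,
				     unsigned long pfn)
{
	return pfn == sg->first_pfn + 1;
}
```

For an entry that already spans three pages starting at frame 100, the
physical test expects frame 103 next, while the first-page test would
accept frame 101, which is already inside the entry.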
Yishai Hadas Jan. 5, 2023, 4:48 p.m. UTC | #2
On 05/01/2023 15:36, Jason Gunthorpe wrote:
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Thanks Jason

>
> Also, I'm looking more closely at '156 and this is not right either:
>
> -               unsigned long paddr =
> -                       (page_to_pfn(sg_page(sgt_append->prv)) * PAGE_SIZE +
> -                        sgt_append->prv->offset + sgt_append->prv->length) /
> -                       PAGE_SIZE;
> -
> -               while (n_pages && page_to_pfn(pages[0]) == paddr) {
> +               last_pg = sg_page(sgt_append->prv);
> +               while (n_pages && pages_are_mergeable(last_pg, pages[0])) {
>
> This change will break things like multi-page combining, sub page
> scenarios and maybe more.
>
> The contiguity test here has to be done a phys, it should go back to
> struct page to check if the pgmap is OK.
>
> Can you fix it as well?


Yes, I have a candidate patch locally, as you asked, on top of this one.

I would like to run some extra testing on it, and may then send it.

Yishai
Jason Gunthorpe Jan. 5, 2023, 8:06 p.m. UTC | #3
On Thu, Jan 05, 2023 at 01:23:39PM +0200, Yishai Hadas wrote:
> When sg_alloc_append_table_from_pages() calls to pages_are_mergeable()
> in its 'sgt_append->prv' flow to check whether it can merge contiguous
> pages into the last SG, it passes the page arguments in the wrong order.
> 
> The first parameter should be the next candidate page to be merged to
> the last page and not the opposite.
> 
> The current code leads to a corrupted SG which resulted in OOPs and
> unexpected errors when non-contiguous pages are merged wrongly.
> 
> Fix to pass the page parameters in the right order.
> 
> Fixes: 1567b49d1a40 ("lib/scatterlist: add check when merging zone device pages")
> Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
> ---
>  lib/scatterlist.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

rdma is pretty much the only user of this API and this bug is causing
bad data corruption, so I'm going to take it to the rdma tree and send
it tomorrow.

Which raises the question why the original patch was done at all,
nothing ever inputs pgmap pages into this function?

Thanks,
Jason
Keith Busch Jan. 5, 2023, 8:21 p.m. UTC | #4
On Thu, Jan 05, 2023 at 04:06:11PM -0400, Jason Gunthorpe wrote:
> rdma is pretty much the only user of this API and this bug is causing
> bad data corruption, so I'm going to take it to the rdma tree and send
> it tomorrow.
> 
> Which raises the question why the original patch was done at all,
> nothing ever inputs pgmap pages into this function?

This just takes any arbitrary user addresses, right? The user could
provide addresses from mmap'ing pci resource files that resolve to pgmap
pages.
Jason Gunthorpe Jan. 5, 2023, 8:23 p.m. UTC | #5
On Thu, Jan 05, 2023 at 01:21:43PM -0700, Keith Busch wrote:
> This just takes any arbitrary user addresses, right? The user could
> provide addresses from mmap'ing pci resource files that resolve to pgmap
> pages.

No, it passes FOLL_LONGTERM and pin_user_pages will not return any pgmaps
in that case.

Jason
Logan Gunthorpe Jan. 5, 2023, 8:23 p.m. UTC | #6
On 2023-01-05 13:06, Jason Gunthorpe wrote:
> rdma is pretty much the only user of this API and this bug is causing
> bad data corruption, so I'm going to take it to the rdma tree and send
> it tomorrow.
> 
> Which raises the question why the original patch was done at all,
> nothing ever inputs pgmap pages into this function?

It was done solely because you had suggested it was necessary.

https://lore.kernel.org/all/20210929224653.GZ964074@nvidia.com/

Though the patch was correct when I originally wrote it, it looks like
I merged it poorly somewhere along the line (roughly v5 of the series)
when the paddr stuff was added. Sorry about that.
The paddr stuff was messy and really hard to understand.

Anyway, Yishai's first patch looks correct to me, but I guess we need to
fix it further. For what it's worth:

Reviewed-by: Logan Gunthorpe <logang@deltatee.com>

Logan
Jason Gunthorpe Jan. 5, 2023, 8:25 p.m. UTC | #7
On Thu, Jan 05, 2023 at 01:23:52PM -0700, Logan Gunthorpe wrote:
> > Which raises the question why the original patch was done at all,
> > nothing ever inputs pgmap pages into this function?
> 
> It was done solely because you had suggested it was necessary.
> 
> https://lore.kernel.org/all/20210929224653.GZ964074@nvidia.com/

Yes, but that was when I was expecting this would work with
FOLL_LONGTERM and pin_user_pages (PUP)..

Jason
Patch

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index a0ad2a7959b5..f72aa50c6654 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -476,7 +476,7 @@  int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
 		/* Merge contiguous pages into the last SG */
 		prv_len = sgt_append->prv->length;
 		last_pg = sg_page(sgt_append->prv);
-		while (n_pages && pages_are_mergeable(last_pg, pages[0])) {
+		while (n_pages && pages_are_mergeable(pages[0], last_pg)) {
 			if (sgt_append->prv->length + PAGE_SIZE > max_segment)
 				break;
 			sgt_append->prv->length += PAGE_SIZE;
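
The hunk above is cut off, so the rest of the loop is not shown. As a
rough userspace sketch of how such a merge loop behaves with the
corrected argument order and the max_segment limit, assuming the loop
advances the last page after each merge (that step is an assumption, as
are the names model_mergeable() and model_merge_into_last() and the
4096-byte MODEL_PAGE_SIZE):

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL

/* Candidate first, last page second: the order the fix restores. */
static bool model_mergeable(unsigned long candidate_pfn,
			    unsigned long last_pfn)
{
	return candidate_pfn == last_pfn + 1;
}

/*
 * Hypothetical model of the append loop. 'pfns' are the candidate page
 * frames, '*last_pfn' is the last frame already in the SG and '*length'
 * is the SG's current byte length. Returns how many candidates merged.
 */
static size_t model_merge_into_last(const unsigned long *pfns, size_t n_pages,
				    unsigned long *last_pfn,
				    unsigned long *length,
				    unsigned long max_segment)
{
	size_t merged = 0;

	while (merged < n_pages &&
	       model_mergeable(pfns[merged], *last_pfn)) {
		/* Stop before the entry would exceed the segment limit. */
		if (*length + MODEL_PAGE_SIZE > max_segment)
			break;
		*length += MODEL_PAGE_SIZE;
		/* Assumption: the merged page becomes the new last page. */
		*last_pfn = pfns[merged];
		merged++;
	}
	return merged;
}
```

With candidates at frames 11, 12 and 20 and the last page at frame 10,
this model merges the first two and stops at the non-contiguous frame;
a smaller max_segment stops it earlier.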