diff mbox series

[v3,4/4] RDMA/umem: batch page unpin in __ib_umem_release()

Message ID 20210205204127.29441-5-joao.m.martins@oracle.com (mailing list archive)
State New, archived
Headers show
Series mm/gup: page unpining improvements | expand

Commit Message

Joao Martins Feb. 5, 2021, 8:41 p.m. UTC
Use the newly added unpin_user_page_range_dirty_lock()
for more quickly unpinning a consecutive range of pages
represented as compound pages. This will also calculate
number of pages to unpin (for the tail pages which matching
head page) and thus batch the refcount update.

Running a test program which calls mr reg/unreg on a 1G in size
and measures cost of both operations together (in a guest using rxe)
with THP and hugetlbfs:

Before:
590 rounds in 5.003 sec: 8480.335 usec / round
6898 rounds in 60.001 sec: 8698.367 usec / round

After:
2688 rounds in 5.002 sec: 1860.786 usec / round
32517 rounds in 60.001 sec: 1845.225 usec / round

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 drivers/infiniband/core/umem.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

Jason Gunthorpe Feb. 10, 2021, 11:17 p.m. UTC | #1
On Fri, Feb 05, 2021 at 08:41:27PM +0000, Joao Martins wrote:
> Use the newly added unpin_user_page_range_dirty_lock()
> for more quickly unpinning a consecutive range of pages
> represented as compound pages. This will also calculate
> number of pages to unpin (for the tail pages which matching
> head page) and thus batch the refcount update.
> 
> Running a test program which calls mr reg/unreg on a 1G in size
> and measures cost of both operations together (in a guest using rxe)
> with THP and hugetlbfs:
> 
> Before:
> 590 rounds in 5.003 sec: 8480.335 usec / round
> 6898 rounds in 60.001 sec: 8698.367 usec / round
> 
> After:
> 2688 rounds in 5.002 sec: 1860.786 usec / round
> 32517 rounds in 60.001 sec: 1845.225 usec / round
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> ---
>  drivers/infiniband/core/umem.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)

Would best for this to go through Andrew's tree

Acked-by: Jason Gunthorpe <jgg@nvidia.com>

4x improvement is pretty good!

Jason
Joao Martins Feb. 11, 2021, 10:35 a.m. UTC | #2
On 2/10/21 11:17 PM, Jason Gunthorpe wrote:
> On Fri, Feb 05, 2021 at 08:41:27PM +0000, Joao Martins wrote:
>> Use the newly added unpin_user_page_range_dirty_lock()
>> for more quickly unpinning a consecutive range of pages
>> represented as compound pages. This will also calculate
>> number of pages to unpin (for the tail pages which matching
>> head page) and thus batch the refcount update.
>>
>> Running a test program which calls mr reg/unreg on a 1G in size
>> and measures cost of both operations together (in a guest using rxe)
>> with THP and hugetlbfs:
>>
>> Before:
>> 590 rounds in 5.003 sec: 8480.335 usec / round
>> 6898 rounds in 60.001 sec: 8698.367 usec / round
>>
>> After:
>> 2688 rounds in 5.002 sec: 1860.786 usec / round
>> 32517 rounds in 60.001 sec: 1845.225 usec / round
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> ---
>>  drivers/infiniband/core/umem.c | 12 ++++++------
>>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> Would best for this to go through Andrew's tree
> 
> Acked-by: Jason Gunthorpe <jgg@nvidia.com>
> 
> 4x improvement is pretty good!
> 

It would only be half of that improvement if it wasn't for your
unpin_user_page_range_dirty_lock() suggestion, so thanks for all
the input :)

	Joao
diff mbox series

Patch

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 2dde99a9ba07..9b607013e2a2 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -47,17 +47,17 @@ 
 
 static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
 {
-	struct sg_page_iter sg_iter;
-	struct page *page;
+	bool make_dirty = umem->writable && dirty;
+	struct scatterlist *sg;
+	unsigned int i;
 
 	if (umem->nmap > 0)
 		ib_dma_unmap_sg(dev, umem->sg_head.sgl, umem->sg_nents,
 				DMA_BIDIRECTIONAL);
 
-	for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_nents, 0) {
-		page = sg_page_iter_page(&sg_iter);
-		unpin_user_pages_dirty_lock(&page, 1, umem->writable && dirty);
-	}
+	for_each_sg(umem->sg_head.sgl, sg, umem->sg_nents, i)
+		unpin_user_page_range_dirty_lock(sg_page(sg),
+			DIV_ROUND_UP(sg->length, PAGE_SIZE), make_dirty);
 
 	sg_free_table(&umem->sg_head);
 }