diff mbox series

[03/13] scatterlist: Add sg_set_folio()

Message ID 20230621164557.3510324-4-willy@infradead.org (mailing list archive)
State New, archived
Headers show
Series Remove pagevecs | expand

Commit Message

Matthew Wilcox June 21, 2023, 4:45 p.m. UTC
This wrapper for sg_set_page() lets drivers add folios to a scatterlist
more easily.  We could, perhaps, do better by using a different page
in the folio if offset is larger than UINT_MAX, but let's hope we get
a better data structure than this before we need to care about such
large folios.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/scatterlist.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

Comments

Zhu Yanjun July 30, 2023, 11:01 a.m. UTC | #1
在 2023/6/22 0:45, Matthew Wilcox (Oracle) 写道:
> This wrapper for sg_set_page() lets drivers add folios to a scatterlist
> more easily.  We could, perhaps, do better by using a different page
> in the folio if offset is larger than UINT_MAX, but let's hope we get
> a better data structure than this before we need to care about such
> large folios.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>   include/linux/scatterlist.h | 24 ++++++++++++++++++++++++
>   1 file changed, 24 insertions(+)
> 
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index ec46d8e8e49d..77df3d7b18a6 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -141,6 +141,30 @@ static inline void sg_set_page(struct scatterlist *sg, struct page *page,
>   	sg->length = len;
>   }
>   
> +/**
> + * sg_set_folio - Set sg entry to point at given folio
> + * @sg:		 SG entry
> + * @folio:	 The folio
> + * @len:	 Length of data
> + * @offset:	 Offset into folio
> + *
> + * Description:
> + *   Use this function to set an sg entry pointing at a folio, never assign
> + *   the folio directly. We encode sg table information in the lower bits
> + *   of the folio pointer. See sg_page() for looking up the page belonging
> + *   to an sg entry.
> + *
> + **/
> +static inline void sg_set_folio(struct scatterlist *sg, struct folio *folio,
> +			       size_t len, size_t offset)
> +{
> +	WARN_ON_ONCE(len > UINT_MAX);
> +	WARN_ON_ONCE(offset > UINT_MAX);
> +	sg_assign_page(sg, &folio->page);
> +	sg->offset = offset;
> +	sg->length = len;
> +}
> +

https://elixir.bootlin.com/linux/latest/source/lib/scatterlist.c#L451

Does the following function have folio version?

"
int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
		struct page **pages, unsigned int n_pages, unsigned int offset,
		unsigned long size, unsigned int max_segment,
		unsigned int left_pages, gfp_t gfp_mask)
"

Thanks a lot.
Zhu Yanjun

>   static inline struct page *sg_page(struct scatterlist *sg)
>   {
>   #ifdef CONFIG_DEBUG_SG
Matthew Wilcox July 30, 2023, 11:18 a.m. UTC | #2
On Sun, Jul 30, 2023 at 07:01:26PM +0800, Zhu Yanjun wrote:
> Does the following function have folio version?
> 
> "
> int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
> 		struct page **pages, unsigned int n_pages, unsigned int offset,
> 		unsigned long size, unsigned int max_segment,
> 		unsigned int left_pages, gfp_t gfp_mask)
> "

No -- I haven't needed to convert anything that uses
sg_alloc_append_table_from_pages() yet.  It doesn't look like it should
be _too_ hard to add a folio version.
Zhu Yanjun July 30, 2023, 1:57 p.m. UTC | #3
在 2023/7/30 19:18, Matthew Wilcox 写道:
> On Sun, Jul 30, 2023 at 07:01:26PM +0800, Zhu Yanjun wrote:
>> Does the following function have folio version?
>>
>> "
>> int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
>> 		struct page **pages, unsigned int n_pages, unsigned int offset,
>> 		unsigned long size, unsigned int max_segment,
>> 		unsigned int left_pages, gfp_t gfp_mask)
>> "
> No -- I haven't needed to convert anything that uses
> sg_alloc_append_table_from_pages() yet.  It doesn't look like it should
> be _too_ hard to add a folio version.

In many places, this function is used. So this function needs the folio 
version.

Another problem, after folio is used, I want to know the performance 
after folio is implemented.

How to make tests to get the performance?

Thanks a lot.

Zhu Yanjun
Matthew Wilcox July 30, 2023, 9:42 p.m. UTC | #4
On Sun, Jul 30, 2023 at 09:57:06PM +0800, Zhu Yanjun wrote:
> 
> 在 2023/7/30 19:18, Matthew Wilcox 写道:
> > On Sun, Jul 30, 2023 at 07:01:26PM +0800, Zhu Yanjun wrote:
> > > Does the following function have folio version?
> > > 
> > > "
> > > int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
> > > 		struct page **pages, unsigned int n_pages, unsigned int offset,
> > > 		unsigned long size, unsigned int max_segment,
> > > 		unsigned int left_pages, gfp_t gfp_mask)
> > > "
> > No -- I haven't needed to convert anything that uses
> > sg_alloc_append_table_from_pages() yet.  It doesn't look like it should
> > be _too_ hard to add a folio version.
> 
> In many places, this function is used. So this function needs the folio
> version.

It's not used in very many places.  But the first one that I see it used
(drivers/infiniband/core/umem.c), you can't do a straightforward folio
conversion:

                pinned = pin_user_pages_fast(cur_base,
                                          min_t(unsigned long, npages,
                                                PAGE_SIZE /
                                                sizeof(struct page *)),
                                          gup_flags, page_list);
...
                ret = sg_alloc_append_table_from_pages(
                        &umem->sgt_append, page_list, pinned, 0,
                        pinned << PAGE_SHIFT, ib_dma_max_seg_size(device),
                        npages, GFP_KERNEL);

That can't be converted to folios.  The GUP might start in the middle of
the folio, and we have no way to communicate that.

This particular usage really needs the phyr work that Jason is doing so
we can efficiently communicate physically contiguous ranges from GUP
to sg.

> Another problem, after folio is used, I want to know the performance after
> folio is implemented.
> 
> How to make tests to get the performance?

You know what you're working on ... I wouldn't know how best to test
your code.
Zhu Yanjun Aug. 18, 2023, 7:05 a.m. UTC | #5
在 2023/7/31 5:42, Matthew Wilcox 写道:
> On Sun, Jul 30, 2023 at 09:57:06PM +0800, Zhu Yanjun wrote:
>> 在 2023/7/30 19:18, Matthew Wilcox 写道:
>>> On Sun, Jul 30, 2023 at 07:01:26PM +0800, Zhu Yanjun wrote:
>>>> Does the following function have folio version?
>>>>
>>>> "
>>>> int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
>>>> 		struct page **pages, unsigned int n_pages, unsigned int offset,
>>>> 		unsigned long size, unsigned int max_segment,
>>>> 		unsigned int left_pages, gfp_t gfp_mask)
>>>> "
>>> No -- I haven't needed to convert anything that uses
>>> sg_alloc_append_table_from_pages() yet.  It doesn't look like it should
>>> be _too_ hard to add a folio version.
>> In many places, this function is used. So this function needs the folio
>> version.
> It's not used in very many places.  But the first one that I see it used
> (drivers/infiniband/core/umem.c), you can't do a straightforward folio
> conversion:
>
>                  pinned = pin_user_pages_fast(cur_base,
>                                            min_t(unsigned long, npages,
>                                                  PAGE_SIZE /
>                                                  sizeof(struct page *)),
>                                            gup_flags, page_list);
> ...
>                  ret = sg_alloc_append_table_from_pages(
>                          &umem->sgt_append, page_list, pinned, 0,
>                          pinned << PAGE_SHIFT, ib_dma_max_seg_size(device),
>                          npages, GFP_KERNEL);
>
> That can't be converted to folios.  The GUP might start in the middle of
> the folio, and we have no way to communicate that.
>
> This particular usage really needs the phyr work that Jason is doing so
> we can efficiently communicate physically contiguous ranges from GUP
> to sg.

Hi, Matthew

Thanks. To the following function, it seems that no folio function 
replace vmalloc_to_page.

vmalloc_to_page calls virt_to_page to get page. Finally the followings 
will be called.

"
(mem_map + ((pfn) - ARCH_PFN_OFFSET))

"

And I do not find the related folio functions with vmalloc_to_page.

And no folio function replaces dma_map_page.

dma_map_page will call dma_map_page_attrs.

Or these 2 function should not be replaced with folio functions?

int irdma_map_vm_page_list(struct irdma_hw *hw, void *va, dma_addr_t 
*pg_dma,

                            u32 pg_cnt)
{
         struct page *vm_page;
         int i;
         u8 *addr;

         addr = (u8 *)(uintptr_t)va;
         for (i = 0; i < pg_cnt; i++) {
                 vm_page = vmalloc_to_page(addr);
                 if (!vm_page)
                         goto err;

                 pg_dma[i] = dma_map_page(hw->device, vm_page, 0, PAGE_SIZE,
                                          DMA_BIDIRECTIONAL);
                 if (dma_mapping_error(hw->device, pg_dma[i]))
                         goto err;

                 addr += PAGE_SIZE;
         }

         return 0;

err:
         irdma_unmap_vm_page_list(hw, pg_dma, i);
         return -ENOMEM;

}

Thanks,

Zhu Yanjun


>> Another problem, after folio is used, I want to know the performance after
>> folio is implemented.
>>
>> How to make tests to get the performance?
> You know what you're working on ... I wouldn't know how best to test
> your code.
diff mbox series

Patch

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index ec46d8e8e49d..77df3d7b18a6 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -141,6 +141,30 @@  static inline void sg_set_page(struct scatterlist *sg, struct page *page,
 	sg->length = len;
 }
 
+/**
+ * sg_set_folio - Set sg entry to point at given folio
+ * @sg:		 SG entry
+ * @folio:	 The folio
+ * @len:	 Length of data
+ * @offset:	 Offset into folio
+ *
+ * Description:
+ *   Use this function to set an sg entry pointing at a folio, never assign
+ *   the folio directly. We encode sg table information in the lower bits
+ *   of the folio pointer. See sg_page() for looking up the page belonging
+ *   to an sg entry.
+ *
+ **/
+static inline void sg_set_folio(struct scatterlist *sg, struct folio *folio,
+			       size_t len, size_t offset)
+{
+	WARN_ON_ONCE(len > UINT_MAX);
+	WARN_ON_ONCE(offset > UINT_MAX);
+	sg_assign_page(sg, &folio->page);
+	sg->offset = offset;
+	sg->length = len;
+}
+
 static inline struct page *sg_page(struct scatterlist *sg)
 {
 #ifdef CONFIG_DEBUG_SG