diff mbox series

[1/9] mm: Introduce new vm_insert_range API

Message ID 20181115154530.GA27872@jordon-HP-15-Notebook-PC (mailing list archive)
State New, archived
Headers show
Series Use vm_insert_range | expand

Commit Message

Souptick Joarder Nov. 15, 2018, 3:45 p.m. UTC
Previouly drivers have their own way of mapping range of
kernel pages/memory into user vma and this was done by
invoking vm_insert_page() within a loop.

As this pattern is common across different drivers, it can
be generalized by creating a new function and use it across
the drivers.

vm_insert_range is the new API which will be used to map a
range of kernel memory/pages to user vma.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <willy@infradead.org>
---
 include/linux/mm_types.h |  3 +++
 mm/memory.c              | 28 ++++++++++++++++++++++++++++
 mm/nommu.c               |  7 +++++++
 3 files changed, 38 insertions(+)

Comments

Randy Dunlap Nov. 15, 2018, 6:13 p.m. UTC | #1
On 11/15/18 7:45 AM, Souptick Joarder wrote:
> Previouly drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
> 
> As this pattern is common across different drivers, it can
> be generalized by creating a new function and use it across
> the drivers.
> 
> vm_insert_range is the new API which will be used to map a
> range of kernel memory/pages to user vma.
> 
> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> Reviewed-by: Matthew Wilcox <willy@infradead.org>
> ---
>  include/linux/mm_types.h |  3 +++
>  mm/memory.c              | 28 ++++++++++++++++++++++++++++
>  mm/nommu.c               |  7 +++++++
>  3 files changed, 38 insertions(+)

Hi,

What is the opposite of vm_insert_range() or even of vm_insert_page()?
or is there no need for that?


> diff --git a/mm/memory.c b/mm/memory.c
> index 15c417e..da904ed 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1478,6 +1478,34 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>  
>  /**
> + * vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @addr: target user address of this page
> + * @pages: pointer to array of source kernel pages
> + * @page_count: no. of pages need to insert into user vma

s/no./number/

> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma. This is a generic function which drivers can use
> + * rather than using their own way of mapping range of kernel pages into
> + * user vma.
> + */
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count)
> +{
> +	unsigned long uaddr = addr;
> +	int ret = 0, i;
> +
> +	for (i = 0; i < page_count; i++) {
> +		ret = vm_insert_page(vma, uaddr, pages[i]);
> +		if (ret < 0)
> +			return ret;

For a non-trivial value of page_count:
Is it a problem if vm_insert_page() succeeds for several pages
and then fails?

> +		uaddr += PAGE_SIZE;
> +	}
> +
> +	return ret;
> +}
> +
> +/**
>   * vm_insert_page - insert single page into user vma
>   * @vma: user vma to map to
>   * @addr: target user address of this page


thanks.
Souptick Joarder Nov. 16, 2018, 5:30 a.m. UTC | #2
On Thu, Nov 15, 2018 at 11:44 PM Randy Dunlap <rdunlap@infradead.org> wrote:
>
> On 11/15/18 7:45 AM, Souptick Joarder wrote:
> > Previouly drivers have their own way of mapping range of
> > kernel pages/memory into user vma and this was done by
> > invoking vm_insert_page() within a loop.
> >
> > As this pattern is common across different drivers, it can
> > be generalized by creating a new function and use it across
> > the drivers.
> >
> > vm_insert_range is the new API which will be used to map a
> > range of kernel memory/pages to user vma.
> >
> > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> > Reviewed-by: Matthew Wilcox <willy@infradead.org>
> > ---
> >  include/linux/mm_types.h |  3 +++
> >  mm/memory.c              | 28 ++++++++++++++++++++++++++++
> >  mm/nommu.c               |  7 +++++++
> >  3 files changed, 38 insertions(+)
>
> Hi,
>
> What is the opposite of vm_insert_range() or even of vm_insert_page()?
> or is there no need for that?

There is no opposite function of vm_insert_range() / vm_insert_page().
My understanding is, in case of any error, mmap handlers will return the
err to user process and user space will decide the next action. So next
time when mmap handler is getting invoked it will map from the beginning.
Correct me if I am wrong.
>
>
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 15c417e..da904ed 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1478,6 +1478,34 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
> >  }
> >
> >  /**
> > + * vm_insert_range - insert range of kernel pages into user vma
> > + * @vma: user vma to map to
> > + * @addr: target user address of this page
> > + * @pages: pointer to array of source kernel pages
> > + * @page_count: no. of pages need to insert into user vma
>
> s/no./number/

I didn't get it ??
>
> > + *
> > + * This allows drivers to insert range of kernel pages they've allocated
> > + * into a user vma. This is a generic function which drivers can use
> > + * rather than using their own way of mapping range of kernel pages into
> > + * user vma.
> > + */
> > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > +                     struct page **pages, unsigned long page_count)
> > +{
> > +     unsigned long uaddr = addr;
> > +     int ret = 0, i;
> > +
> > +     for (i = 0; i < page_count; i++) {
> > +             ret = vm_insert_page(vma, uaddr, pages[i]);
> > +             if (ret < 0)
> > +                     return ret;
>
> For a non-trivial value of page_count:
> Is it a problem if vm_insert_page() succeeds for several pages
> and then fails?

No, it will be considered as total failure and mmap handler will return
the err to user space.
>
> > +             uaddr += PAGE_SIZE;
> > +     }
> > +
> > +     return ret;
> > +}
> > +
> > +/**
> >   * vm_insert_page - insert single page into user vma
> >   * @vma: user vma to map to
> >   * @addr: target user address of this page
>
>
> thanks.
> --
> ~Randy
Matthew Wilcox Nov. 16, 2018, 6:40 a.m. UTC | #3
On Fri, Nov 16, 2018 at 11:00:30AM +0530, Souptick Joarder wrote:
> On Thu, Nov 15, 2018 at 11:44 PM Randy Dunlap <rdunlap@infradead.org> wrote:
> > On 11/15/18 7:45 AM, Souptick Joarder wrote:
> > What is the opposite of vm_insert_range() or even of vm_insert_page()?
> > or is there no need for that?
> 
> There is no opposite function of vm_insert_range() / vm_insert_page().
> My understanding is, in case of any error, mmap handlers will return the
> err to user process and user space will decide the next action. So next
> time when mmap handler is getting invoked it will map from the beginning.
> Correct me if I am wrong.

The opposite function, I suppose, is unmap_region().

> > s/no./number/
> 
> I didn't get it ??

This is a 'sed' expression.  's' is the 'substitute' command; the /
is a separator, 'no.' is what you wrote, and 'number' is what Randy
is recommending instead.

> > > +     for (i = 0; i < page_count; i++) {
> > > +             ret = vm_insert_page(vma, uaddr, pages[i]);
> > > +             if (ret < 0)
> > > +                     return ret;
> >
> > For a non-trivial value of page_count:
> > Is it a problem if vm_insert_page() succeeds for several pages
> > and then fails?
> 
> No, it will be considered as total failure and mmap handler will return
> the err to user space.

I think what Randy means is "What happens to the inserted pages?" and
the answer is that mmap_region() jumps to the 'unmap_and_free_vma'
label, which is an accurate name.

Thanks for looking, Randy.
Souptick Joarder Nov. 16, 2018, 8:15 a.m. UTC | #4
On Fri, Nov 16, 2018 at 12:11 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Fri, Nov 16, 2018 at 11:00:30AM +0530, Souptick Joarder wrote:
> > On Thu, Nov 15, 2018 at 11:44 PM Randy Dunlap <rdunlap@infradead.org> wrote:
> > > On 11/15/18 7:45 AM, Souptick Joarder wrote:
> > > What is the opposite of vm_insert_range() or even of vm_insert_page()?
> > > or is there no need for that?
> >
> > There is no opposite function of vm_insert_range() / vm_insert_page().
> > My understanding is, in case of any error, mmap handlers will return the
> > err to user process and user space will decide the next action. So next
> > time when mmap handler is getting invoked it will map from the beginning.
> > Correct me if I am wrong.
>
> The opposite function, I suppose, is unmap_region().
>
> > > s/no./number/
> >
> > I didn't get it ??
>
> This is a 'sed' expression.  's' is the 'substitute' command; the /
> is a separator, 'no.' is what you wrote, and 'number' is what Randy
> is recommending instead.

Ok. Will change it in v2.
>
> > > > +     for (i = 0; i < page_count; i++) {
> > > > +             ret = vm_insert_page(vma, uaddr, pages[i]);
> > > > +             if (ret < 0)
> > > > +                     return ret;
> > >
> > > For a non-trivial value of page_count:
> > > Is it a problem if vm_insert_page() succeeds for several pages
> > > and then fails?
> >
> > No, it will be considered as total failure and mmap handler will return
> > the err to user space.
>
> I think what Randy means is "What happens to the inserted pages?" and
> the answer is that mmap_region() jumps to the 'unmap_and_free_vma'
> label, which is an accurate name.

Sorry for incorrect understanding of the question.
Randy Dunlap Nov. 16, 2018, 4:59 p.m. UTC | #5
On 11/16/18 12:15 AM, Souptick Joarder wrote:
> On Fri, Nov 16, 2018 at 12:11 PM Matthew Wilcox <willy@infradead.org> wrote:
>>
>> On Fri, Nov 16, 2018 at 11:00:30AM +0530, Souptick Joarder wrote:
>>> On Thu, Nov 15, 2018 at 11:44 PM Randy Dunlap <rdunlap@infradead.org> wrote:
>>>> On 11/15/18 7:45 AM, Souptick Joarder wrote:
>>>> What is the opposite of vm_insert_range() or even of vm_insert_page()?
>>>> or is there no need for that?
>>>
>>> There is no opposite function of vm_insert_range() / vm_insert_page().
>>> My understanding is, in case of any error, mmap handlers will return the
>>> err to user process and user space will decide the next action. So next
>>> time when mmap handler is getting invoked it will map from the beginning.
>>> Correct me if I am wrong.
>>
>> The opposite function, I suppose, is unmap_region().
>>
>>>> s/no./number/
>>>
>>> I didn't get it ??
>>
>> This is a 'sed' expression.  's' is the 'substitute' command; the /
>> is a separator, 'no.' is what you wrote, and 'number' is what Randy
>> is recommending instead.
> 
> Ok. Will change it in v2.

Thanks.

>>>>> +     for (i = 0; i < page_count; i++) {
>>>>> +             ret = vm_insert_page(vma, uaddr, pages[i]);
>>>>> +             if (ret < 0)
>>>>> +                     return ret;
>>>>
>>>> For a non-trivial value of page_count:
>>>> Is it a problem if vm_insert_page() succeeds for several pages
>>>> and then fails?
>>>
>>> No, it will be considered as total failure and mmap handler will return
>>> the err to user space.
>>
>> I think what Randy means is "What happens to the inserted pages?" and
>> the answer is that mmap_region() jumps to the 'unmap_and_free_vma'
>> label, which is an accurate name.

which says:

	/* Undo any partial mapping done by a device driver. */
	unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);

and that is what I was missing.  Thanks.

> Sorry for incorrect understanding of the question.

No problem.
Slavomir Kaslev Nov. 16, 2018, 5:55 p.m. UTC | #6
On Thu, Nov 15, 2018 at 5:42 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>
> Previouly drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
>
> As this pattern is common across different drivers, it can
> be generalized by creating a new function and use it across
> the drivers.
>
> vm_insert_range is the new API which will be used to map a
> range of kernel memory/pages to user vma.
>
> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> Reviewed-by: Matthew Wilcox <willy@infradead.org>
> ---
>  include/linux/mm_types.h |  3 +++
>  mm/memory.c              | 28 ++++++++++++++++++++++++++++
>  mm/nommu.c               |  7 +++++++
>  3 files changed, 38 insertions(+)
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5ed8f62..15ae24f 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -523,6 +523,9 @@ extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
>  extern void tlb_finish_mmu(struct mmu_gather *tlb,
>                                 unsigned long start, unsigned long end);
>
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +                       struct page **pages, unsigned long page_count);
> +
>  static inline void init_tlb_flush_pending(struct mm_struct *mm)
>  {
>         atomic_set(&mm->tlb_flush_pending, 0);
> diff --git a/mm/memory.c b/mm/memory.c
> index 15c417e..da904ed 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1478,6 +1478,34 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>
>  /**
> + * vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @addr: target user address of this page
> + * @pages: pointer to array of source kernel pages
> + * @page_count: no. of pages need to insert into user vma
> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma. This is a generic function which drivers can use
> + * rather than using their own way of mapping range of kernel pages into
> + * user vma.
> + */
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +                       struct page **pages, unsigned long page_count)
> +{
> +       unsigned long uaddr = addr;
> +       int ret = 0, i;
> +
> +       for (i = 0; i < page_count; i++) {
> +               ret = vm_insert_page(vma, uaddr, pages[i]);
> +               if (ret < 0)
> +                       return ret;
> +               uaddr += PAGE_SIZE;
> +       }
> +
> +       return ret;
> +}

+ EXPORT_SYMBOL(vm_insert_range);

> +
> +/**
>   * vm_insert_page - insert single page into user vma
>   * @vma: user vma to map to
>   * @addr: target user address of this page
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 749276b..d6ef5c7 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>  EXPORT_SYMBOL(vm_insert_page);
>
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +                       struct page **pages, unsigned long page_count)
> +{
> +       return -EINVAL;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
>  /*
>   *  sys_brk() for the most part doesn't need the global kernel
>   *  lock, except when an application is doing something nasty
> --
> 1.9.1
>
Mike Rapoport Nov. 16, 2018, 6:28 p.m. UTC | #7
On Thu, Nov 15, 2018 at 09:15:30PM +0530, Souptick Joarder wrote:
> Previouly drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
> 
> As this pattern is common across different drivers, it can
> be generalized by creating a new function and use it across
> the drivers.
> 
> vm_insert_range is the new API which will be used to map a
> range of kernel memory/pages to user vma.
> 
> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> Reviewed-by: Matthew Wilcox <willy@infradead.org>
> ---
>  include/linux/mm_types.h |  3 +++
>  mm/memory.c              | 28 ++++++++++++++++++++++++++++
>  mm/nommu.c               |  7 +++++++
>  3 files changed, 38 insertions(+)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 5ed8f62..15ae24f 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -523,6 +523,9 @@ extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
>  extern void tlb_finish_mmu(struct mmu_gather *tlb,
>  				unsigned long start, unsigned long end);
> 
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count);
> +
>  static inline void init_tlb_flush_pending(struct mm_struct *mm)
>  {
>  	atomic_set(&mm->tlb_flush_pending, 0);
> diff --git a/mm/memory.c b/mm/memory.c
> index 15c417e..da904ed 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1478,6 +1478,34 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
> 
>  /**
> + * vm_insert_range - insert range of kernel pages into user vma
> + * @vma: user vma to map to
> + * @addr: target user address of this page
> + * @pages: pointer to array of source kernel pages
> + * @page_count: no. of pages need to insert into user vma
> + *
> + * This allows drivers to insert range of kernel pages they've allocated
> + * into a user vma. This is a generic function which drivers can use
> + * rather than using their own way of mapping range of kernel pages into
> + * user vma.

Please add the return value and context descriptions.

> + */
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count)
> +{
> +	unsigned long uaddr = addr;
> +	int ret = 0, i;
> +
> +	for (i = 0; i < page_count; i++) {
> +		ret = vm_insert_page(vma, uaddr, pages[i]);
> +		if (ret < 0)
> +			return ret;
> +		uaddr += PAGE_SIZE;
> +	}
> +
> +	return ret;
> +}
> +
> +/**
>   * vm_insert_page - insert single page into user vma
>   * @vma: user vma to map to
>   * @addr: target user address of this page
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 749276b..d6ef5c7 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
>  }
>  EXPORT_SYMBOL(vm_insert_page);
> 
> +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> +			struct page **pages, unsigned long page_count)
> +{
> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL(vm_insert_range);
> +
>  /*
>   *  sys_brk() for the most part doesn't need the global kernel
>   *  lock, except when an application is doing something nasty
> -- 
> 1.9.1
>
Souptick Joarder Nov. 17, 2018, 6:56 a.m. UTC | #8
On Fri, Nov 16, 2018 at 11:59 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>
> On Thu, Nov 15, 2018 at 09:15:30PM +0530, Souptick Joarder wrote:
> > Previouly drivers have their own way of mapping range of
> > kernel pages/memory into user vma and this was done by
> > invoking vm_insert_page() within a loop.
> >
> > As this pattern is common across different drivers, it can
> > be generalized by creating a new function and use it across
> > the drivers.
> >
> > vm_insert_range is the new API which will be used to map a
> > range of kernel memory/pages to user vma.
> >
> > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> > Reviewed-by: Matthew Wilcox <willy@infradead.org>
> > ---
> >  include/linux/mm_types.h |  3 +++
> >  mm/memory.c              | 28 ++++++++++++++++++++++++++++
> >  mm/nommu.c               |  7 +++++++
> >  3 files changed, 38 insertions(+)
> >
> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 5ed8f62..15ae24f 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -523,6 +523,9 @@ extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
> >  extern void tlb_finish_mmu(struct mmu_gather *tlb,
> >                               unsigned long start, unsigned long end);
> >
> > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > +                     struct page **pages, unsigned long page_count);
> > +
> >  static inline void init_tlb_flush_pending(struct mm_struct *mm)
> >  {
> >       atomic_set(&mm->tlb_flush_pending, 0);
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 15c417e..da904ed 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1478,6 +1478,34 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
> >  }
> >
> >  /**
> > + * vm_insert_range - insert range of kernel pages into user vma
> > + * @vma: user vma to map to
> > + * @addr: target user address of this page
> > + * @pages: pointer to array of source kernel pages
> > + * @page_count: no. of pages need to insert into user vma
> > + *
> > + * This allows drivers to insert range of kernel pages they've allocated
> > + * into a user vma. This is a generic function which drivers can use
> > + * rather than using their own way of mapping range of kernel pages into
> > + * user vma.
>
> Please add the return value and context descriptions.
>
> > + */
> > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > +                     struct page **pages, unsigned long page_count)
> > +{
> > +     unsigned long uaddr = addr;
> > +     int ret = 0, i;
> > +
> > +     for (i = 0; i < page_count; i++) {
> > +             ret = vm_insert_page(vma, uaddr, pages[i]);
> > +             if (ret < 0)
> > +                     return ret;
> > +             uaddr += PAGE_SIZE;
> > +     }
> > +
> > +     return ret;
> > +}
> > +
> > +/**
> >   * vm_insert_page - insert single page into user vma
> >   * @vma: user vma to map to
> >   * @addr: target user address of this page
> > diff --git a/mm/nommu.c b/mm/nommu.c
> > index 749276b..d6ef5c7 100644
> > --- a/mm/nommu.c
> > +++ b/mm/nommu.c
> > @@ -473,6 +473,13 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
> >  }
> >  EXPORT_SYMBOL(vm_insert_page);
> >
> > +int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
> > +                     struct page **pages, unsigned long page_count)
> > +{
> > +     return -EINVAL;
> > +}
> > +EXPORT_SYMBOL(vm_insert_range);
> > +
> >  /*
> >   *  sys_brk() for the most part doesn't need the global kernel
> >   *  lock, except when an application is doing something nasty
> > --
> > 1.9.1
> >
>
> --
> Sincerely yours,
> Mike.
>

Sure I will wait for some time to get additional review comments and
add all of those requested changes in v2.

Any further feedback on driver changes as part of this patch series ?
Matthew Wilcox Nov. 17, 2018, 2:37 p.m. UTC | #9
On Sat, Nov 17, 2018 at 12:26:38PM +0530, Souptick Joarder wrote:
> On Fri, Nov 16, 2018 at 11:59 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > + * vm_insert_range - insert range of kernel pages into user vma
> > > + * @vma: user vma to map to
> > > + * @addr: target user address of this page
> > > + * @pages: pointer to array of source kernel pages
> > > + * @page_count: no. of pages need to insert into user vma
> > > + *
> > > + * This allows drivers to insert range of kernel pages they've allocated
> > > + * into a user vma. This is a generic function which drivers can use
> > > + * rather than using their own way of mapping range of kernel pages into
> > > + * user vma.
> >
> > Please add the return value and context descriptions.
> >
> 
> Sure I will wait for some time to get additional review comments and
> add all of those requested changes in v2.

You could send your proposed wording now which might remove the need
for a v3 if we end up arguing about the wording.
Souptick Joarder Nov. 19, 2018, 3:13 p.m. UTC | #10
Hi Mike,

On Sat, Nov 17, 2018 at 8:07 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Sat, Nov 17, 2018 at 12:26:38PM +0530, Souptick Joarder wrote:
> > On Fri, Nov 16, 2018 at 11:59 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > > + * vm_insert_range - insert range of kernel pages into user vma
> > > > + * @vma: user vma to map to
> > > > + * @addr: target user address of this page
> > > > + * @pages: pointer to array of source kernel pages
> > > > + * @page_count: no. of pages need to insert into user vma
> > > > + *
> > > > + * This allows drivers to insert range of kernel pages they've allocated
> > > > + * into a user vma. This is a generic function which drivers can use
> > > > + * rather than using their own way of mapping range of kernel pages into
> > > > + * user vma.
> > >
> > > Please add the return value and context descriptions.
> > >
> >
> > Sure I will wait for some time to get additional review comments and
> > add all of those requested changes in v2.
>
> You could send your proposed wording now which might remove the need
> for a v3 if we end up arguing about the wording.

Does this description looks good ?

/**
 * vm_insert_range - insert range of kernel pages into user vma
 * @vma: user vma to map to
 * @addr: target user address of this page
 * @pages: pointer to array of source kernel pages
 * @page_count: number of pages need to insert into user vma
 *
 * This allows drivers to insert range of kernel pages they've allocated
 * into a user vma. This is a generic function which drivers can use
 * rather than using their own way of mapping range of kernel pages into
 * user vma.
 *
 * Context - Process context. Called by mmap handlers.
 * Return - int error value
 * 0                    - OK
 * -EINVAL              - Invalid argument
 * -ENOMEM              - No memory
 * -EFAULT              - Bad address
 * -EBUSY               - Device or resource busy
 */
Mike Rapoport Nov. 19, 2018, 4:26 p.m. UTC | #11
On Mon, Nov 19, 2018 at 08:43:09PM +0530, Souptick Joarder wrote:
> Hi Mike,
> 
> On Sat, Nov 17, 2018 at 8:07 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Sat, Nov 17, 2018 at 12:26:38PM +0530, Souptick Joarder wrote:
> > > On Fri, Nov 16, 2018 at 11:59 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > > > + * vm_insert_range - insert range of kernel pages into user vma
> > > > > + * @vma: user vma to map to
> > > > > + * @addr: target user address of this page
> > > > > + * @pages: pointer to array of source kernel pages
> > > > > + * @page_count: no. of pages need to insert into user vma
> > > > > + *
> > > > > + * This allows drivers to insert range of kernel pages they've allocated
> > > > > + * into a user vma. This is a generic function which drivers can use
> > > > > + * rather than using their own way of mapping range of kernel pages into
> > > > > + * user vma.
> > > >
> > > > Please add the return value and context descriptions.
> > > >
> > >
> > > Sure I will wait for some time to get additional review comments and
> > > add all of those requested changes in v2.
> >
> > You could send your proposed wording now which might remove the need
> > for a v3 if we end up arguing about the wording.
> 
> Does this description looks good ?
> 
> /**
>  * vm_insert_range - insert range of kernel pages into user vma
>  * @vma: user vma to map to
>  * @addr: target user address of this page
>  * @pages: pointer to array of source kernel pages
>  * @page_count: number of pages need to insert into user vma
>  *
>  * This allows drivers to insert range of kernel pages they've allocated
>  * into a user vma. This is a generic function which drivers can use
>  * rather than using their own way of mapping range of kernel pages into
>  * user vma.
>  *
>  * Context - Process context. Called by mmap handlers.

Context:

>  * Return - int error value

Return:

>  * 0                    - OK
>  * -EINVAL              - Invalid argument
>  * -ENOMEM              - No memory
>  * -EFAULT              - Bad address
>  * -EBUSY               - Device or resource busy

I don't think that elaborate description of error values is needed, just "0
on success and error code otherwise" would be sufficient.

>  */
>
Souptick Joarder Nov. 19, 2018, 5:45 p.m. UTC | #12
On Mon, Nov 19, 2018 at 9:56 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>
> On Mon, Nov 19, 2018 at 08:43:09PM +0530, Souptick Joarder wrote:
> > Hi Mike,
> >
> > On Sat, Nov 17, 2018 at 8:07 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Sat, Nov 17, 2018 at 12:26:38PM +0530, Souptick Joarder wrote:
> > > > On Fri, Nov 16, 2018 at 11:59 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > > > > + * vm_insert_range - insert range of kernel pages into user vma
> > > > > > + * @vma: user vma to map to
> > > > > > + * @addr: target user address of this page
> > > > > > + * @pages: pointer to array of source kernel pages
> > > > > > + * @page_count: no. of pages need to insert into user vma
> > > > > > + *
> > > > > > + * This allows drivers to insert range of kernel pages they've allocated
> > > > > > + * into a user vma. This is a generic function which drivers can use
> > > > > > + * rather than using their own way of mapping range of kernel pages into
> > > > > > + * user vma.
> > > > >
> > > > > Please add the return value and context descriptions.
> > > > >
> > > >
> > > > Sure I will wait for some time to get additional review comments and
> > > > add all of those requested changes in v2.
> > >
> > > You could send your proposed wording now which might remove the need
> > > for a v3 if we end up arguing about the wording.
> >
> > Does this description looks good ?
> >
> > /**
> >  * vm_insert_range - insert range of kernel pages into user vma
> >  * @vma: user vma to map to
> >  * @addr: target user address of this page
> >  * @pages: pointer to array of source kernel pages
> >  * @page_count: number of pages need to insert into user vma
> >  *
> >  * This allows drivers to insert range of kernel pages they've allocated
> >  * into a user vma. This is a generic function which drivers can use
> >  * rather than using their own way of mapping range of kernel pages into
> >  * user vma.
> >  *
> >  * Context - Process context. Called by mmap handlers.
>
> Context:
>
> >  * Return - int error value
>
> Return:
>
> >  * 0                    - OK
> >  * -EINVAL              - Invalid argument
> >  * -ENOMEM              - No memory
> >  * -EFAULT              - Bad address
> >  * -EBUSY               - Device or resource busy
>
> I don't think that elaborate description of error values is needed, just "0
> on success and error code otherwise" would be sufficient.

/**
 * vm_insert_range - insert range of kernel pages into user vma
 * @vma: user vma to map to
 * @addr: target user address of this page
 * @pages: pointer to array of source kernel pages
 * @page_count: number of pages need to insert into user vma
 *
 * This allows drivers to insert range of kernel pages they've allocated
 * into a user vma. This is a generic function which drivers can use
 * rather than using their own way of mapping range of kernel pages into
 * user vma.
 *
 * Context: Process context. Called by mmap handlers.
 * Return: 0 on success and error code otherwise
 */
William Kucharski Nov. 21, 2018, 11:19 a.m. UTC | #13
Could you add a line to the description explicitly stating that a failure
to insert any page in the range will fail the entire routine, something
like:

> * This allows drivers to insert range of kernel pages they've allocated
> * into a user vma. This is a generic function which drivers can use
> * rather than using their own way of mapping range of kernel pages into
> * user vma.
> *
> * A failure to insert any page in the range will fail the call as a whole.

It's obvious when reading the code, but it would be self-documenting to
state it outright.

Thanks!
    -- Bill
Matthew Wilcox Nov. 21, 2018, 12:35 p.m. UTC | #14
On Wed, Nov 21, 2018 at 04:19:11AM -0700, William Kucharski wrote:
> Could you add a line to the description explicitly stating that a failure
> to insert any page in the range will fail the entire routine, something
> like:
> 
> > * This allows drivers to insert range of kernel pages they've allocated
> > * into a user vma. This is a generic function which drivers can use
> > * rather than using their own way of mapping range of kernel pages into
> > * user vma.
> > *
> > * A failure to insert any page in the range will fail the call as a whole.
> 
> It's obvious when reading the code, but it would be self-documenting to
> state it outright.

It's probably better to be more explicit and answer Randy's question:

 * If we fail to insert any page into the vma, the function will return
 * immediately leaving any previously-inserted pages present.  Callers
 * from the mmap handler may immediately return the error as their
 * caller will destroy the vma, removing any successfully-inserted pages.
 * Other callers should make their own arrangements for calling unmap_region().

Although unmap_region() is static so there clearly isn't any code in the
kernel today other than in mmap handlers (or fault handlers) that needs to
insert pages into a VMA.
William Kucharski Nov. 21, 2018, 2:29 p.m. UTC | #15
> On Nov 21, 2018, at 5:35 AM, Matthew Wilcox <willy@infradead.org> wrote:
> 
> It's probably better to be more explicit and answer Randy's question:
> 
> * If we fail to insert any page into the vma, the function will return
> * immediately leaving any previously-inserted pages present.  Callers
> * from the mmap handler may immediately return the error as their
> * caller will destroy the vma, removing any successfully-inserted pages.
> * Other callers should make their own arrangements for calling unmap_region().

That works for me as well.
Mike Rapoport Nov. 23, 2018, 7:19 a.m. UTC | #16
On Mon, Nov 19, 2018 at 11:15:15PM +0530, Souptick Joarder wrote:
> On Mon, Nov 19, 2018 at 9:56 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> >
> > On Mon, Nov 19, 2018 at 08:43:09PM +0530, Souptick Joarder wrote:
> > > Hi Mike,
> > >
> > > On Sat, Nov 17, 2018 at 8:07 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > >
> > > > On Sat, Nov 17, 2018 at 12:26:38PM +0530, Souptick Joarder wrote:
> > > > > On Fri, Nov 16, 2018 at 11:59 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > > > > > + * vm_insert_range - insert range of kernel pages into user vma
> > > > > > > + * @vma: user vma to map to
> > > > > > > + * @addr: target user address of this page
> > > > > > > + * @pages: pointer to array of source kernel pages
> > > > > > > + * @page_count: no. of pages need to insert into user vma
> > > > > > > + *
> > > > > > > + * This allows drivers to insert range of kernel pages they've allocated
> > > > > > > + * into a user vma. This is a generic function which drivers can use
> > > > > > > + * rather than using their own way of mapping range of kernel pages into
> > > > > > > + * user vma.
> > > > > >
> > > > > > Please add the return value and context descriptions.
> > > > > >
> > > > >
> > > > > Sure I will wait for some time to get additional review comments and
> > > > > add all of those requested changes in v2.
> > > >
> > > > You could send your proposed wording now which might remove the need
> > > > for a v3 if we end up arguing about the wording.
> > >
> > > Does this description looks good ?
> > >
> > > /**
> > >  * vm_insert_range - insert range of kernel pages into user vma
> > >  * @vma: user vma to map to
> > >  * @addr: target user address of this page
> > >  * @pages: pointer to array of source kernel pages
> > >  * @page_count: number of pages need to insert into user vma
> > >  *
> > >  * This allows drivers to insert range of kernel pages they've allocated
> > >  * into a user vma. This is a generic function which drivers can use
> > >  * rather than using their own way of mapping range of kernel pages into
> > >  * user vma.
> > >  *
> > >  * Context - Process context. Called by mmap handlers.
> >
> > Context:
> >
> > >  * Return - int error value
> >
> > Return:
> >
> > >  * 0                    - OK
> > >  * -EINVAL              - Invalid argument
> > >  * -ENOMEM              - No memory
> > >  * -EFAULT              - Bad address
> > >  * -EBUSY               - Device or resource busy
> >
> > I don't think that elaborate description of error values is needed, just "0
> > on success and error code otherwise" would be sufficient.
> 
> /**
>  * vm_insert_range - insert range of kernel pages into user vma
>  * @vma: user vma to map to
>  * @addr: target user address of this page
>  * @pages: pointer to array of source kernel pages
>  * @page_count: number of pages need to insert into user vma
>  *
>  * This allows drivers to insert range of kernel pages they've allocated
>  * into a user vma. This is a generic function which drivers can use
>  * rather than using their own way of mapping range of kernel pages into
>  * user vma.
>  *
>  * Context: Process context. Called by mmap handlers.
>  * Return: 0 on success and error code otherwise
>  */

Looks good to me.
Heiko Stübner Nov. 28, 2018, 3:21 p.m. UTC | #17
Am Donnerstag, 15. November 2018, 16:45:30 CET schrieb Souptick Joarder:
> Previouly drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
> 
> As this pattern is common across different drivers, it can
> be generalized by creating a new function and use it across
> the drivers.
> 
> vm_insert_range is the new API which will be used to map a
> range of kernel memory/pages to user vma.
> 
> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> Reviewed-by: Matthew Wilcox <willy@infradead.org>

Except the missing EXPORT_SYMBOL for module builds this new
API is supposed to run also within the Rockchip drm driver, so
on rk3188, rk3288, rk3328 and rk3399 with graphics
Tested-by: Heiko Stuebner <heiko@sntech.de>
diff mbox series

Patch

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5ed8f62..15ae24f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -523,6 +523,9 @@  extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
 extern void tlb_finish_mmu(struct mmu_gather *tlb,
 				unsigned long start, unsigned long end);
 
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+			struct page **pages, unsigned long page_count);
+
 static inline void init_tlb_flush_pending(struct mm_struct *mm)
 {
 	atomic_set(&mm->tlb_flush_pending, 0);
diff --git a/mm/memory.c b/mm/memory.c
index 15c417e..da904ed 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1478,6 +1478,34 @@  static int insert_page(struct vm_area_struct *vma, unsigned long addr,
 }
 
 /**
+ * vm_insert_range - insert range of kernel pages into user vma
+ * @vma: user vma to map to
+ * @addr: target user address of this page
+ * @pages: pointer to array of source kernel pages
+ * @page_count: no. of pages need to insert into user vma
+ *
+ * This allows drivers to insert range of kernel pages they've allocated
+ * into a user vma. This is a generic function which drivers can use
+ * rather than using their own way of mapping range of kernel pages into
+ * user vma.
+ */
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+			struct page **pages, unsigned long page_count)
+{
+	unsigned long uaddr = addr;
+	int ret = 0, i;
+
+	for (i = 0; i < page_count; i++) {
+		ret = vm_insert_page(vma, uaddr, pages[i]);
+		if (ret < 0)
+			return ret;
+		uaddr += PAGE_SIZE;
+	}
+
+	return ret;
+}
+
+/**
  * vm_insert_page - insert single page into user vma
  * @vma: user vma to map to
  * @addr: target user address of this page
diff --git a/mm/nommu.c b/mm/nommu.c
index 749276b..d6ef5c7 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -473,6 +473,13 @@  int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vm_insert_page);
 
+int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
+			struct page **pages, unsigned long page_count)
+{
+	return -EINVAL;
+}
+EXPORT_SYMBOL(vm_insert_range);
+
 /*
  *  sys_brk() for the most part doesn't need the global kernel
  *  lock, except when an application is doing something nasty