diff mbox series

[3/7] mm/gup: remove vmas parameter from get_user_pages_remote()

Message ID 5a4cf1ebf1c6cdfabbf2f5209facb0180dd20006.1681508038.git.lstoakes@gmail.com (mailing list archive)
State Mainlined, archived
Headers show
Series None | expand

Commit Message

Lorenzo Stoakes April 14, 2023, 11:27 p.m. UTC
The only instances of get_user_pages_remote() invocations which used the
vmas parameter were for a single page which can instead simply look up the
VMA directly. In particular:-

- __update_ref_ctr() looked up the VMA but did nothing with it so we simply
  remove it.

- __access_remote_vm() was already using vma_lookup() when the original
  lookup failed so by doing the lookup directly this also de-duplicates the
  code.

This forms part of a broader set of patches intended to eliminate the vmas
parameter altogether.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
 arch/arm64/kernel/mte.c   |  5 +++--
 arch/s390/kvm/interrupt.c |  2 +-
 fs/exec.c                 |  2 +-
 include/linux/mm.h        |  2 +-
 kernel/events/uprobes.c   | 10 +++++-----
 mm/gup.c                  | 12 ++++--------
 mm/memory.c               |  9 +++++----
 mm/rmap.c                 |  2 +-
 security/tomoyo/domain.c  |  2 +-
 virt/kvm/async_pf.c       |  3 +--
 10 files changed, 23 insertions(+), 26 deletions(-)

Comments

Tetsuo Handa April 15, 2023, 12:25 a.m. UTC | #1
On 2023/04/15 8:27, Lorenzo Stoakes wrote:
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index f5bcb0dc6267..74d8d4007dec 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
>  		struct page *page = NULL;
>  
>  		ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
> -					    &vma, NULL);
> -		if (ret <= 0)
> +					    NULL);
> +		vma = vma_lookup(mm, addr);
> +		if (ret <= 0 || !vma)
>  			break;

This conversion looks wrong. When get_user_pages_remote(&page) returned > 0,
put_page(page) is needed even if vma_lookup() returned NULL, isn't it?
Jason Gunthorpe April 17, 2023, 1:09 p.m. UTC | #2
On Sat, Apr 15, 2023 at 12:27:31AM +0100, Lorenzo Stoakes wrote:
> The only instances of get_user_pages_remote() invocations which used the
> vmas parameter were for a single page which can instead simply look up the
> VMA directly. In particular:-
> 
> - __update_ref_ctr() looked up the VMA but did nothing with it so we simply
>   remove it.
> 
> - __access_remote_vm() was already using vma_lookup() when the original
>   lookup failed so by doing the lookup directly this also de-duplicates the
>   code.
> 
> This forms part of a broader set of patches intended to eliminate the vmas
> parameter altogether.
> 
> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> ---
>  arch/arm64/kernel/mte.c   |  5 +++--
>  arch/s390/kvm/interrupt.c |  2 +-
>  fs/exec.c                 |  2 +-
>  include/linux/mm.h        |  2 +-
>  kernel/events/uprobes.c   | 10 +++++-----
>  mm/gup.c                  | 12 ++++--------
>  mm/memory.c               |  9 +++++----
>  mm/rmap.c                 |  2 +-
>  security/tomoyo/domain.c  |  2 +-
>  virt/kvm/async_pf.c       |  3 +--
>  10 files changed, 23 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index f5bcb0dc6267..74d8d4007dec 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
>  		struct page *page = NULL;
>  
>  		ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
> -					    &vma, NULL);
> -		if (ret <= 0)
> +					    NULL);
> +		vma = vma_lookup(mm, addr);
> +		if (ret <= 0 || !vma)
>  			break;

Given the slightly tricky error handling, it would make sense to turn
this pattern into a helper function:

page = get_single_user_page_locked(mm, addr, gup_flags, &vma);
if (IS_ERR(page))
  [..]

static inline struct page *get_single_user_page_locked(struct mm_struct *mm,
   unsigned long addr, int gup_flags, struct vm_area_struct **vma)
{
	struct page *page;
	int ret;

	ret = get_user_pages_remote(*mm, addr, 1, gup_flags, &page, NULL, NULL);
	if (ret < 0)
	   return ERR_PTR(ret);
	if (WARN_ON(ret == 0))
	   return ERR_PTR(-EINVAL);
        *vma = vma_lookup(mm, addr);
	if (WARN_ON(!*vma) {
	   put_user_page(page);
	   return ERR_PTR(-EINVAL);
        }
	return page;
}

It could be its own patch so this change was just a mechanical removal
of NULL

Jason
Lorenzo Stoakes April 17, 2023, 1:13 p.m. UTC | #3
On Mon, Apr 17, 2023 at 10:09:36AM -0300, Jason Gunthorpe wrote:
> On Sat, Apr 15, 2023 at 12:27:31AM +0100, Lorenzo Stoakes wrote:
> > The only instances of get_user_pages_remote() invocations which used the
> > vmas parameter were for a single page which can instead simply look up the
> > VMA directly. In particular:-
> >
> > - __update_ref_ctr() looked up the VMA but did nothing with it so we simply
> >   remove it.
> >
> > - __access_remote_vm() was already using vma_lookup() when the original
> >   lookup failed so by doing the lookup directly this also de-duplicates the
> >   code.
> >
> > This forms part of a broader set of patches intended to eliminate the vmas
> > parameter altogether.
> >
> > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > ---
> >  arch/arm64/kernel/mte.c   |  5 +++--
> >  arch/s390/kvm/interrupt.c |  2 +-
> >  fs/exec.c                 |  2 +-
> >  include/linux/mm.h        |  2 +-
> >  kernel/events/uprobes.c   | 10 +++++-----
> >  mm/gup.c                  | 12 ++++--------
> >  mm/memory.c               |  9 +++++----
> >  mm/rmap.c                 |  2 +-
> >  security/tomoyo/domain.c  |  2 +-
> >  virt/kvm/async_pf.c       |  3 +--
> >  10 files changed, 23 insertions(+), 26 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> > index f5bcb0dc6267..74d8d4007dec 100644
> > --- a/arch/arm64/kernel/mte.c
> > +++ b/arch/arm64/kernel/mte.c
> > @@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
> >  		struct page *page = NULL;
> >
> >  		ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
> > -					    &vma, NULL);
> > -		if (ret <= 0)
> > +					    NULL);
> > +		vma = vma_lookup(mm, addr);
> > +		if (ret <= 0 || !vma)
> >  			break;
>
> Given the slightly tricky error handling, it would make sense to turn
> this pattern into a helper function:
>
> page = get_single_user_page_locked(mm, addr, gup_flags, &vma);
> if (IS_ERR(page))
>   [..]
>
> static inline struct page *get_single_user_page_locked(struct mm_struct *mm,
>    unsigned long addr, int gup_flags, struct vm_area_struct **vma)
> {
> 	struct page *page;
> 	int ret;
>
> 	ret = get_user_pages_remote(*mm, addr, 1, gup_flags, &page, NULL, NULL);
> 	if (ret < 0)
> 	   return ERR_PTR(ret);
> 	if (WARN_ON(ret == 0))
> 	   return ERR_PTR(-EINVAL);
>         *vma = vma_lookup(mm, addr);
> 	if (WARN_ON(!*vma) {
> 	   put_user_page(page);
> 	   return ERR_PTR(-EINVAL);
>         }
> 	return page;
> }
>
> It could be its own patch so this change was just a mechanical removal
> of NULL
>
> Jason
>

Agreed, I think this would work better as a follow up patch however so as
not to distract too much from the core change. I feel like there are quite
a few things we can follow up on including assessing whether we might be
able to use _fast() paths in places (I haven't assessed this yet).
Jason Gunthorpe April 17, 2023, 1:16 p.m. UTC | #4
On Mon, Apr 17, 2023 at 02:13:39PM +0100, Lorenzo Stoakes wrote:
> On Mon, Apr 17, 2023 at 10:09:36AM -0300, Jason Gunthorpe wrote:
> > On Sat, Apr 15, 2023 at 12:27:31AM +0100, Lorenzo Stoakes wrote:
> > > The only instances of get_user_pages_remote() invocations which used the
> > > vmas parameter were for a single page which can instead simply look up the
> > > VMA directly. In particular:-
> > >
> > > - __update_ref_ctr() looked up the VMA but did nothing with it so we simply
> > >   remove it.
> > >
> > > - __access_remote_vm() was already using vma_lookup() when the original
> > >   lookup failed so by doing the lookup directly this also de-duplicates the
> > >   code.
> > >
> > > This forms part of a broader set of patches intended to eliminate the vmas
> > > parameter altogether.
> > >
> > > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > > ---
> > >  arch/arm64/kernel/mte.c   |  5 +++--
> > >  arch/s390/kvm/interrupt.c |  2 +-
> > >  fs/exec.c                 |  2 +-
> > >  include/linux/mm.h        |  2 +-
> > >  kernel/events/uprobes.c   | 10 +++++-----
> > >  mm/gup.c                  | 12 ++++--------
> > >  mm/memory.c               |  9 +++++----
> > >  mm/rmap.c                 |  2 +-
> > >  security/tomoyo/domain.c  |  2 +-
> > >  virt/kvm/async_pf.c       |  3 +--
> > >  10 files changed, 23 insertions(+), 26 deletions(-)
> > >
> > > diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> > > index f5bcb0dc6267..74d8d4007dec 100644
> > > --- a/arch/arm64/kernel/mte.c
> > > +++ b/arch/arm64/kernel/mte.c
> > > @@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
> > >  		struct page *page = NULL;
> > >
> > >  		ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
> > > -					    &vma, NULL);
> > > -		if (ret <= 0)
> > > +					    NULL);
> > > +		vma = vma_lookup(mm, addr);
> > > +		if (ret <= 0 || !vma)
> > >  			break;
> >
> > Given the slightly tricky error handling, it would make sense to turn
> > this pattern into a helper function:
> >
> > page = get_single_user_page_locked(mm, addr, gup_flags, &vma);
> > if (IS_ERR(page))
> >   [..]
> >
> > static inline struct page *get_single_user_page_locked(struct mm_struct *mm,
> >    unsigned long addr, int gup_flags, struct vm_area_struct **vma)
> > {
> > 	struct page *page;
> > 	int ret;
> >
> > 	ret = get_user_pages_remote(*mm, addr, 1, gup_flags, &page, NULL, NULL);
> > 	if (ret < 0)
> > 	   return ERR_PTR(ret);
> > 	if (WARN_ON(ret == 0))
> > 	   return ERR_PTR(-EINVAL);
> >         *vma = vma_lookup(mm, addr);
> > 	if (WARN_ON(!*vma) {
> > 	   put_user_page(page);
> > 	   return ERR_PTR(-EINVAL);
> >         }
> > 	return page;
> > }
> >
> > It could be its own patch so this change was just a mechanical removal
> > of NULL
> >
> > Jason
> >
> 
> Agreed, I think this would work better as a follow up patch however so as
> not to distract too much from the core change. 

I don't think you should open code sketchy error handling in several
places and then clean it up later. Just do it right from the start.

Jason
Lorenzo Stoakes April 17, 2023, 1:23 p.m. UTC | #5
On Mon, Apr 17, 2023 at 10:16:28AM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 17, 2023 at 02:13:39PM +0100, Lorenzo Stoakes wrote:
> > On Mon, Apr 17, 2023 at 10:09:36AM -0300, Jason Gunthorpe wrote:
> > > On Sat, Apr 15, 2023 at 12:27:31AM +0100, Lorenzo Stoakes wrote:
> > > > The only instances of get_user_pages_remote() invocations which used the
> > > > vmas parameter were for a single page which can instead simply look up the
> > > > VMA directly. In particular:-
> > > >
> > > > - __update_ref_ctr() looked up the VMA but did nothing with it so we simply
> > > >   remove it.
> > > >
> > > > - __access_remote_vm() was already using vma_lookup() when the original
> > > >   lookup failed so by doing the lookup directly this also de-duplicates the
> > > >   code.
> > > >
> > > > This forms part of a broader set of patches intended to eliminate the vmas
> > > > parameter altogether.
> > > >
> > > > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> > > > ---
> > > >  arch/arm64/kernel/mte.c   |  5 +++--
> > > >  arch/s390/kvm/interrupt.c |  2 +-
> > > >  fs/exec.c                 |  2 +-
> > > >  include/linux/mm.h        |  2 +-
> > > >  kernel/events/uprobes.c   | 10 +++++-----
> > > >  mm/gup.c                  | 12 ++++--------
> > > >  mm/memory.c               |  9 +++++----
> > > >  mm/rmap.c                 |  2 +-
> > > >  security/tomoyo/domain.c  |  2 +-
> > > >  virt/kvm/async_pf.c       |  3 +--
> > > >  10 files changed, 23 insertions(+), 26 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> > > > index f5bcb0dc6267..74d8d4007dec 100644
> > > > --- a/arch/arm64/kernel/mte.c
> > > > +++ b/arch/arm64/kernel/mte.c
> > > > @@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
> > > >  		struct page *page = NULL;
> > > >
> > > >  		ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
> > > > -					    &vma, NULL);
> > > > -		if (ret <= 0)
> > > > +					    NULL);
> > > > +		vma = vma_lookup(mm, addr);
> > > > +		if (ret <= 0 || !vma)
> > > >  			break;
> > >
> > > Given the slightly tricky error handling, it would make sense to turn
> > > this pattern into a helper function:
> > >
> > > page = get_single_user_page_locked(mm, addr, gup_flags, &vma);
> > > if (IS_ERR(page))
> > >   [..]
> > >
> > > static inline struct page *get_single_user_page_locked(struct mm_struct *mm,
> > >    unsigned long addr, int gup_flags, struct vm_area_struct **vma)
> > > {
> > > 	struct page *page;
> > > 	int ret;
> > >
> > > 	ret = get_user_pages_remote(*mm, addr, 1, gup_flags, &page, NULL, NULL);
> > > 	if (ret < 0)
> > > 	   return ERR_PTR(ret);
> > > 	if (WARN_ON(ret == 0))
> > > 	   return ERR_PTR(-EINVAL);
> > >         *vma = vma_lookup(mm, addr);
> > > 	if (WARN_ON(!*vma) {
> > > 	   put_user_page(page);
> > > 	   return ERR_PTR(-EINVAL);
> > >         }
> > > 	return page;
> > > }
> > >
> > > It could be its own patch so this change was just a mechanical removal
> > > of NULL
> > >
> > > Jason
> > >
> >
> > Agreed, I think this would work better as a follow up patch however so as
> > not to distract too much from the core change.
>
> I don't think you should open code sketchy error handling in several
> places and then clean it up later. Just do it right from the start.
>

Intent was to do smallest change possible (though through review that grew
of course), but I see your point, in this instance this is fiddly stuff and
probably better to abstract it to enforce correct handling.

I'll respin + add something like this.

> Jason
Eric W. Biederman April 17, 2023, 3:07 p.m. UTC | #6
Lorenzo Stoakes <lstoakes@gmail.com> writes:

> On Mon, Apr 17, 2023 at 10:16:28AM -0300, Jason Gunthorpe wrote:
>> On Mon, Apr 17, 2023 at 02:13:39PM +0100, Lorenzo Stoakes wrote:
>> > On Mon, Apr 17, 2023 at 10:09:36AM -0300, Jason Gunthorpe wrote:
>> > > On Sat, Apr 15, 2023 at 12:27:31AM +0100, Lorenzo Stoakes wrote:
>> > > > The only instances of get_user_pages_remote() invocations which used the
>> > > > vmas parameter were for a single page which can instead simply look up the
>> > > > VMA directly. In particular:-
>> > > >
>> > > > - __update_ref_ctr() looked up the VMA but did nothing with it so we simply
>> > > >   remove it.
>> > > >
>> > > > - __access_remote_vm() was already using vma_lookup() when the original
>> > > >   lookup failed so by doing the lookup directly this also de-duplicates the
>> > > >   code.
>> > > >
>> > > > This forms part of a broader set of patches intended to eliminate the vmas
>> > > > parameter altogether.
>> > > >
>> > > > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
>> > > > ---
>> > > >  arch/arm64/kernel/mte.c   |  5 +++--
>> > > >  arch/s390/kvm/interrupt.c |  2 +-
>> > > >  fs/exec.c                 |  2 +-
>> > > >  include/linux/mm.h        |  2 +-
>> > > >  kernel/events/uprobes.c   | 10 +++++-----
>> > > >  mm/gup.c                  | 12 ++++--------
>> > > >  mm/memory.c               |  9 +++++----
>> > > >  mm/rmap.c                 |  2 +-
>> > > >  security/tomoyo/domain.c  |  2 +-
>> > > >  virt/kvm/async_pf.c       |  3 +--
>> > > >  10 files changed, 23 insertions(+), 26 deletions(-)
>> > > >
>> > > > diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
>> > > > index f5bcb0dc6267..74d8d4007dec 100644
>> > > > --- a/arch/arm64/kernel/mte.c
>> > > > +++ b/arch/arm64/kernel/mte.c
>> > > > @@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
>> > > >  		struct page *page = NULL;
>> > > >
>> > > >  		ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
>> > > > -					    &vma, NULL);
>> > > > -		if (ret <= 0)
>> > > > +					    NULL);
>> > > > +		vma = vma_lookup(mm, addr);
>> > > > +		if (ret <= 0 || !vma)
>> > > >  			break;
>> > >
>> > > Given the slightly tricky error handling, it would make sense to turn
>> > > this pattern into a helper function:
>> > >
>> > > page = get_single_user_page_locked(mm, addr, gup_flags, &vma);
>> > > if (IS_ERR(page))
>> > >   [..]
>> > >
>> > > static inline struct page *get_single_user_page_locked(struct mm_struct *mm,
>> > >    unsigned long addr, int gup_flags, struct vm_area_struct **vma)
>> > > {
>> > > 	struct page *page;
>> > > 	int ret;
>> > >
>> > > 	ret = get_user_pages_remote(*mm, addr, 1, gup_flags, &page, NULL, NULL);
>> > > 	if (ret < 0)
>> > > 	   return ERR_PTR(ret);
>> > > 	if (WARN_ON(ret == 0))
>> > > 	   return ERR_PTR(-EINVAL);
>> > >         *vma = vma_lookup(mm, addr);
>> > > 	if (WARN_ON(!*vma) {
>> > > 	   put_user_page(page);
>> > > 	   return ERR_PTR(-EINVAL);
>> > >         }
>> > > 	return page;
>> > > }
>> > >
>> > > It could be its own patch so this change was just a mechanical removal
>> > > of NULL
>> > >
>> > > Jason
>> > >
>> >
>> > Agreed, I think this would work better as a follow up patch however so as
>> > not to distract too much from the core change.
>>
>> I don't think you should open code sketchy error handling in several
>> places and then clean it up later. Just do it right from the start.
>>
>
> Intent was to do smallest change possible (though through review that grew
> of course), but I see your point, in this instance this is fiddly stuff and
> probably better to abstract it to enforce correct handling.
>
> I'll respin + add something like this.

Could you include in your description why looking up the vma after
getting the page does not introduce a race?

I am probably silly and just looking at this quickly but it does not
seem immediately obvious why the vma and the page should match.

I would not be surprised if you hold the appropriate mutex over the
entire operation but it just isn't apparent from the diff.

I am concerned because it is an easy mistake to refactor something into
two steps and then discover you have introduced a race.

Eric
Lorenzo Stoakes April 17, 2023, 3:14 p.m. UTC | #7
On Mon, Apr 17, 2023 at 10:07:53AM -0500, Eric W. Biederman wrote:
> Lorenzo Stoakes <lstoakes@gmail.com> writes:
>
> > On Mon, Apr 17, 2023 at 10:16:28AM -0300, Jason Gunthorpe wrote:
> >> On Mon, Apr 17, 2023 at 02:13:39PM +0100, Lorenzo Stoakes wrote:
> >> > On Mon, Apr 17, 2023 at 10:09:36AM -0300, Jason Gunthorpe wrote:
> >> > > On Sat, Apr 15, 2023 at 12:27:31AM +0100, Lorenzo Stoakes wrote:
> >> > > > The only instances of get_user_pages_remote() invocations which used the
> >> > > > vmas parameter were for a single page which can instead simply look up the
> >> > > > VMA directly. In particular:-
> >> > > >
> >> > > > - __update_ref_ctr() looked up the VMA but did nothing with it so we simply
> >> > > >   remove it.
> >> > > >
> >> > > > - __access_remote_vm() was already using vma_lookup() when the original
> >> > > >   lookup failed so by doing the lookup directly this also de-duplicates the
> >> > > >   code.
> >> > > >
> >> > > > This forms part of a broader set of patches intended to eliminate the vmas
> >> > > > parameter altogether.
> >> > > >
> >> > > > Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> >> > > > ---
> >> > > >  arch/arm64/kernel/mte.c   |  5 +++--
> >> > > >  arch/s390/kvm/interrupt.c |  2 +-
> >> > > >  fs/exec.c                 |  2 +-
> >> > > >  include/linux/mm.h        |  2 +-
> >> > > >  kernel/events/uprobes.c   | 10 +++++-----
> >> > > >  mm/gup.c                  | 12 ++++--------
> >> > > >  mm/memory.c               |  9 +++++----
> >> > > >  mm/rmap.c                 |  2 +-
> >> > > >  security/tomoyo/domain.c  |  2 +-
> >> > > >  virt/kvm/async_pf.c       |  3 +--
> >> > > >  10 files changed, 23 insertions(+), 26 deletions(-)
> >> > > >
> >> > > > diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> >> > > > index f5bcb0dc6267..74d8d4007dec 100644
> >> > > > --- a/arch/arm64/kernel/mte.c
> >> > > > +++ b/arch/arm64/kernel/mte.c
> >> > > > @@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
> >> > > >  		struct page *page = NULL;
> >> > > >
> >> > > >  		ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
> >> > > > -					    &vma, NULL);
> >> > > > -		if (ret <= 0)
> >> > > > +					    NULL);
> >> > > > +		vma = vma_lookup(mm, addr);
> >> > > > +		if (ret <= 0 || !vma)
> >> > > >  			break;
> >> > >
> >> > > Given the slightly tricky error handling, it would make sense to turn
> >> > > this pattern into a helper function:
> >> > >
> >> > > page = get_single_user_page_locked(mm, addr, gup_flags, &vma);
> >> > > if (IS_ERR(page))
> >> > >   [..]
> >> > >
> >> > > static inline struct page *get_single_user_page_locked(struct mm_struct *mm,
> >> > >    unsigned long addr, int gup_flags, struct vm_area_struct **vma)
> >> > > {
> >> > > 	struct page *page;
> >> > > 	int ret;
> >> > >
> >> > > 	ret = get_user_pages_remote(*mm, addr, 1, gup_flags, &page, NULL, NULL);
> >> > > 	if (ret < 0)
> >> > > 	   return ERR_PTR(ret);
> >> > > 	if (WARN_ON(ret == 0))
> >> > > 	   return ERR_PTR(-EINVAL);
> >> > >         *vma = vma_lookup(mm, addr);
> >> > > 	if (WARN_ON(!*vma) {
> >> > > 	   put_user_page(page);
> >> > > 	   return ERR_PTR(-EINVAL);
> >> > >         }
> >> > > 	return page;
> >> > > }
> >> > >
> >> > > It could be its own patch so this change was just a mechanical removal
> >> > > of NULL
> >> > >
> >> > > Jason
> >> > >
> >> >
> >> > Agreed, I think this would work better as a follow up patch however so as
> >> > not to distract too much from the core change.
> >>
> >> I don't think you should open code sketchy error handling in several
> >> places and then clean it up later. Just do it right from the start.
> >>
> >
> > Intent was to do smallest change possible (though through review that grew
> > of course), but I see your point, in this instance this is fiddly stuff and
> > probably better to abstract it to enforce correct handling.
> >
> > I'll respin + add something like this.
>
> Could you include in your description why looking up the vma after
> getting the page does not introduce a race?
>
> I am probably silly and just looking at this quickly but it does not
> seem immediately obvious why the vma and the page should match.
>
> I would not be surprised if you hold the appropriate mutex over the
> entire operation but it just isn't apparent from the diff.
>
> I am concerned because it is an easy mistake to refactor something into
> two steps and then discover you have introduced a race.
>
> Eric
>

The mmap_lock is held so VMAs cannot be altered and no such race can
occur. get_user_pages_remote() requires that the user calls it under the
lock so it is implicitly assured that this cannot happen.

I'll update the description to make this clear on the next spin!

(side-note: here is another irritating issue with GUP, we don't suffix with
_locked() consistently)
diff mbox series

Patch

diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index f5bcb0dc6267..74d8d4007dec 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -437,8 +437,9 @@  static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
 		struct page *page = NULL;
 
 		ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
-					    &vma, NULL);
-		if (ret <= 0)
+					    NULL);
+		vma = vma_lookup(mm, addr);
+		if (ret <= 0 || !vma)
 			break;
 
 		/*
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 9250fde1f97d..c19d0cb7d2f2 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -2777,7 +2777,7 @@  static struct page *get_map_page(struct kvm *kvm, u64 uaddr)
 
 	mmap_read_lock(kvm->mm);
 	get_user_pages_remote(kvm->mm, uaddr, 1, FOLL_WRITE,
-			      &page, NULL, NULL);
+			      &page, NULL);
 	mmap_read_unlock(kvm->mm);
 	return page;
 }
diff --git a/fs/exec.c b/fs/exec.c
index 87cf3a2f0e9a..d8d48ee15aac 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -219,7 +219,7 @@  static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
 	 */
 	mmap_read_lock(bprm->mm);
 	ret = get_user_pages_remote(bprm->mm, pos, 1, gup_flags,
-			&page, NULL, NULL);
+			&page, NULL);
 	mmap_read_unlock(bprm->mm);
 	if (ret <= 0)
 		return NULL;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 513d5fab02f1..8dfa236cfb58 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2374,7 +2374,7 @@  extern int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
 long get_user_pages_remote(struct mm_struct *mm,
 			    unsigned long start, unsigned long nr_pages,
 			    unsigned int gup_flags, struct page **pages,
-			    struct vm_area_struct **vmas, int *locked);
+			    int *locked);
 long pin_user_pages_remote(struct mm_struct *mm,
 			   unsigned long start, unsigned long nr_pages,
 			   unsigned int gup_flags, struct page **pages,
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 59887c69d54c..35e8a7ec884c 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -365,7 +365,6 @@  __update_ref_ctr(struct mm_struct *mm, unsigned long vaddr, short d)
 {
 	void *kaddr;
 	struct page *page;
-	struct vm_area_struct *vma;
 	int ret;
 	short *ptr;
 
@@ -373,7 +372,7 @@  __update_ref_ctr(struct mm_struct *mm, unsigned long vaddr, short d)
 		return -EINVAL;
 
 	ret = get_user_pages_remote(mm, vaddr, 1,
-			FOLL_WRITE, &page, &vma, NULL);
+				    FOLL_WRITE, &page, NULL);
 	if (unlikely(ret <= 0)) {
 		/*
 		 * We are asking for 1 page. If get_user_pages_remote() fails,
@@ -475,8 +474,9 @@  int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
 		gup_flags |= FOLL_SPLIT_PMD;
 	/* Read the page with vaddr into memory */
 	ret = get_user_pages_remote(mm, vaddr, 1, gup_flags,
-				    &old_page, &vma, NULL);
-	if (ret <= 0)
+				    &old_page, NULL);
+	vma = vma_lookup(mm, vaddr);
+	if (ret <= 0 || !vma)
 		return ret;
 
 	ret = verify_opcode(old_page, vaddr, &opcode);
@@ -2028,7 +2028,7 @@  static int is_trap_at_addr(struct mm_struct *mm, unsigned long vaddr)
 	 * essentially a kernel access to the memory.
 	 */
 	result = get_user_pages_remote(mm, vaddr, 1, FOLL_FORCE, &page,
-			NULL, NULL);
+				       NULL);
 	if (result < 0)
 		return result;
 
diff --git a/mm/gup.c b/mm/gup.c
index 931c805bc32b..9440aa54c741 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2165,8 +2165,6 @@  static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
  * @pages:	array that receives pointers to the pages pinned.
  *		Should be at least nr_pages long. Or NULL, if caller
  *		only intends to ensure the pages are faulted in.
- * @vmas:	array of pointers to vmas corresponding to each page.
- *		Or NULL if the caller does not require them.
  * @locked:	pointer to lock flag indicating whether lock is held and
  *		subsequently whether VM_FAULT_RETRY functionality can be
  *		utilised. Lock must initially be held.
@@ -2181,8 +2179,6 @@  static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
  *
  * The caller is responsible for releasing returned @pages, via put_page().
  *
- * @vmas are valid only as long as mmap_lock is held.
- *
  * Must be called with mmap_lock held for read or write.
  *
  * get_user_pages_remote walks a process's page tables and takes a reference
@@ -2219,15 +2215,15 @@  static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
 long get_user_pages_remote(struct mm_struct *mm,
 		unsigned long start, unsigned long nr_pages,
 		unsigned int gup_flags, struct page **pages,
-		struct vm_area_struct **vmas, int *locked)
+		int *locked)
 {
 	int local_locked = 1;
 
-	if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
+	if (!is_valid_gup_args(pages, NULL, locked, &gup_flags,
 			       FOLL_TOUCH | FOLL_REMOTE))
 		return -EINVAL;
 
-	return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
+	return __get_user_pages_locked(mm, start, nr_pages, pages, NULL,
 				       locked ? locked : &local_locked,
 				       gup_flags);
 }
@@ -2237,7 +2233,7 @@  EXPORT_SYMBOL(get_user_pages_remote);
 long get_user_pages_remote(struct mm_struct *mm,
 			   unsigned long start, unsigned long nr_pages,
 			   unsigned int gup_flags, struct page **pages,
-			   struct vm_area_struct **vmas, int *locked)
+			   int *locked)
 {
 	return 0;
 }
diff --git a/mm/memory.c b/mm/memory.c
index ea8fdca35df3..43426147f9f7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5596,7 +5596,11 @@  int __access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf,
 		struct page *page = NULL;
 
 		ret = get_user_pages_remote(mm, addr, 1,
-				gup_flags, &page, &vma, NULL);
+				gup_flags, &page, NULL);
+		vma = vma_lookup(mm, addr);
+		if (!vma)
+			break;
+
 		if (ret <= 0) {
 #ifndef CONFIG_HAVE_IOREMAP_PROT
 			break;
@@ -5605,9 +5609,6 @@  int __access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf,
 			 * Check if this is a VM_IO | VM_PFNMAP VMA, which
 			 * we can access using slightly different code.
 			 */
-			vma = vma_lookup(mm, addr);
-			if (!vma)
-				break;
 			if (vma->vm_ops && vma->vm_ops->access)
 				ret = vma->vm_ops->access(vma, addr, buf,
 							  len, write);
diff --git a/mm/rmap.c b/mm/rmap.c
index ba901c416785..756ea8a9bb90 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2324,7 +2324,7 @@  int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
 
 	npages = get_user_pages_remote(mm, start, npages,
 				       FOLL_GET | FOLL_WRITE | FOLL_SPLIT_PMD,
-				       pages, NULL, NULL);
+				       pages, NULL);
 	if (npages < 0)
 		return npages;
 
diff --git a/security/tomoyo/domain.c b/security/tomoyo/domain.c
index 31af29f669d2..ac20c0bdff9d 100644
--- a/security/tomoyo/domain.c
+++ b/security/tomoyo/domain.c
@@ -916,7 +916,7 @@  bool tomoyo_dump_page(struct linux_binprm *bprm, unsigned long pos,
 	 */
 	mmap_read_lock(bprm->mm);
 	ret = get_user_pages_remote(bprm->mm, pos, 1,
-				    FOLL_FORCE, &page, NULL, NULL);
+				    FOLL_FORCE, &page, NULL);
 	mmap_read_unlock(bprm->mm);
 	if (ret <= 0)
 		return false;
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index 9bfe1d6f6529..e033c79d528e 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -61,8 +61,7 @@  static void async_pf_execute(struct work_struct *work)
 	 * access remotely.
 	 */
 	mmap_read_lock(mm);
-	get_user_pages_remote(mm, addr, 1, FOLL_WRITE, NULL, NULL,
-			&locked);
+	get_user_pages_remote(mm, addr, 1, FOLL_WRITE, NULL, &locked);
 	if (locked)
 		mmap_read_unlock(mm);