diff mbox series

[1/2] mm/uffd: Fix vma operation where start addr cuts part of vma

Message ID 20230517150408.3411044-2-peterx@redhat.com (mailing list archive)
State New
Headers show
Series mm/uffd: Fix vma merge/split | expand

Commit Message

Peter Xu May 17, 2023, 3:04 p.m. UTC
It seems vma merging with uffd paths is broken with either
register/unregister, where right now we can feed wrong parameters to
vma_merge() and it's found by recent patch which moved asserts upwards in
vma_merge() by Lorenzo Stoakes:

https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/

The problem is in the current code base we didn't fixup "prev" for the case
where "start" address can be within the "prev" vma section.  In that case
we should have "prev" points to the current vma rather than the previous
one when feeding to vma_merge().

This patch will eliminate the report and make sure vma_merge() calls will
become legal again.

One thing to mention is that the "Fixes: 29417d292bd0" below is there only
to help explain where the warning can start to trigger, the real commit to
fix should be 69dbe6daf104.  Commit 29417d292bd0 helps us to identify the
issue, but unfortunately we may want to keep it in Fixes too just to ease
kernel backporters for easier tracking.

Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 29417d292bd0 ("mm/mmap/vma_merge: always check invariants")
Fixes: 69dbe6daf104 ("userfaultfd: use maple tree iterator to iterate VMAs")
Closes: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 fs/userfaultfd.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Lorenzo Stoakes May 17, 2023, 5:20 p.m. UTC | #1
On Wed, May 17, 2023 at 11:04:07AM -0400, Peter Xu wrote:
> It seems vma merging with uffd paths is broken with either
> register/unregister, where right now we can feed wrong parameters to
> vma_merge() and it's found by recent patch which moved asserts upwards in
> vma_merge() by Lorenzo Stoakes:
>
> https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
>
> The problem is in the current code base we didn't fixup "prev" for the case
> where "start" address can be within the "prev" vma section.  In that case
> we should have "prev" points to the current vma rather than the previous
> one when feeding to vma_merge().

This doesn't seem quite correct, perhaps - "where start is contained within vma
but not clamped to its start. We need to convert this into case 4 which permits
subdivision of prev by assigning vma to prev. As we loop, each subsequent VMA
will be clamped to the start."

>
> This patch will eliminate the report and make sure vma_merge() calls will
> become legal again.
>
> One thing to mention is that the "Fixes: 29417d292bd0" below is there only
> to help explain where the warning can start to trigger, the real commit to
> fix should be 69dbe6daf104.  Commit 29417d292bd0 helps us to identify the
> issue, but unfortunately we may want to keep it in Fixes too just to ease
> kernel backporters for easier tracking.
>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Mike Rapoport (IBM) <rppt@kernel.org>
> Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
> Reported-by: Mark Rutland <mark.rutland@arm.com>
> Fixes: 29417d292bd0 ("mm/mmap/vma_merge: always check invariants")
> Fixes: 69dbe6daf104 ("userfaultfd: use maple tree iterator to iterate VMAs")
> Closes: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> Cc: linux-stable <stable@vger.kernel.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  fs/userfaultfd.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 0fd96d6e39ce..17c8c345dac4 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1459,6 +1459,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>
>  	vma_iter_set(&vmi, start);
>  	prev = vma_prev(&vmi);
> +	if (vma->vm_start < start)
> +		prev = vma;
>
>  	ret = 0;
>  	for_each_vma_range(vmi, vma, end) {
> @@ -1625,6 +1627,9 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>
>  	vma_iter_set(&vmi, start);
>  	prev = vma_prev(&vmi);
> +	if (vma->vm_start < start)
> +		prev = vma;
> +
>  	ret = 0;
>  	for_each_vma_range(vmi, vma, end) {
>  		cond_resched();
> --
> 2.39.1
>

Other than that looks good:-

Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Liam R. Howlett May 17, 2023, 6:01 p.m. UTC | #2
* Peter Xu <peterx@redhat.com> [230517 11:04]:
> It seems vma merging with uffd paths is broken with either
> register/unregister, where right now we can feed wrong parameters to
> vma_merge() and it's found by recent patch which moved asserts upwards in
> vma_merge() by Lorenzo Stoakes:
> 
> https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> 
> The problem is in the current code base we didn't fixup "prev" for the case
> where "start" address can be within the "prev" vma section.  In that case
> we should have "prev" points to the current vma rather than the previous
> one when feeding to vma_merge().
> 
> This patch will eliminate the report and make sure vma_merge() calls will
> become legal again.
> 
> One thing to mention is that the "Fixes: 29417d292bd0" below is there only
> to help explain where the warning can start to trigger, the real commit to
> fix should be 69dbe6daf104.  Commit 29417d292bd0 helps us to identify the
> issue, but unfortunately we may want to keep it in Fixes too just to ease
> kernel backporters for easier tracking.
> 
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Mike Rapoport (IBM) <rppt@kernel.org>
> Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
> Reported-by: Mark Rutland <mark.rutland@arm.com>
> Fixes: 29417d292bd0 ("mm/mmap/vma_merge: always check invariants")
> Fixes: 69dbe6daf104 ("userfaultfd: use maple tree iterator to iterate VMAs")
> Closes: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> Cc: linux-stable <stable@vger.kernel.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> ---
>  fs/userfaultfd.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 0fd96d6e39ce..17c8c345dac4 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1459,6 +1459,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
>  
>  	vma_iter_set(&vmi, start);
>  	prev = vma_prev(&vmi);
> +	if (vma->vm_start < start)
> +		prev = vma;
>  
>  	ret = 0;
>  	for_each_vma_range(vmi, vma, end) {
> @@ -1625,6 +1627,9 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
>  
>  	vma_iter_set(&vmi, start);
>  	prev = vma_prev(&vmi);
> +	if (vma->vm_start < start)
> +		prev = vma;
> +
>  	ret = 0;
>  	for_each_vma_range(vmi, vma, end) {
>  		cond_resched();
> -- 
> 2.39.1
>
Peter Xu May 17, 2023, 6:37 p.m. UTC | #3
On Wed, May 17, 2023 at 06:20:55PM +0100, Lorenzo Stoakes wrote:
> On Wed, May 17, 2023 at 11:04:07AM -0400, Peter Xu wrote:
> > It seems vma merging with uffd paths is broken with either
> > register/unregister, where right now we can feed wrong parameters to
> > vma_merge() and it's found by recent patch which moved asserts upwards in
> > vma_merge() by Lorenzo Stoakes:
> >
> > https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> >
> > The problem is in the current code base we didn't fixup "prev" for the case
> > where "start" address can be within the "prev" vma section.  In that case
> > we should have "prev" points to the current vma rather than the previous
> > one when feeding to vma_merge().
> 
> This doesn't seem quite correct, perhaps - "where start is contained within vma
> but not clamped to its start. We need to convert this into case 4 which permits
> subdivision of prev by assigning vma to prev. As we loop, each subsequent VMA
> will be clamped to the start."

I think it covers more than case 4 - it can also be case 0 where no merge
will happen?

> 
> >
> > This patch will eliminate the report and make sure vma_merge() calls will
> > become legal again.
> >
> > One thing to mention is that the "Fixes: 29417d292bd0" below is there only
> > to help explain where the warning can start to trigger, the real commit to
> > fix should be 69dbe6daf104.  Commit 29417d292bd0 helps us to identify the
> > issue, but unfortunately we may want to keep it in Fixes too just to ease
> > kernel backporters for easier tracking.
> >
> > Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> > Cc: Mike Rapoport (IBM) <rppt@kernel.org>
> > Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
> > Reported-by: Mark Rutland <mark.rutland@arm.com>
> > Fixes: 29417d292bd0 ("mm/mmap/vma_merge: always check invariants")
> > Fixes: 69dbe6daf104 ("userfaultfd: use maple tree iterator to iterate VMAs")
> > Closes: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> > Cc: linux-stable <stable@vger.kernel.org>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  fs/userfaultfd.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > index 0fd96d6e39ce..17c8c345dac4 100644
> > --- a/fs/userfaultfd.c
> > +++ b/fs/userfaultfd.c
> > @@ -1459,6 +1459,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
> >
> >  	vma_iter_set(&vmi, start);
> >  	prev = vma_prev(&vmi);
> > +	if (vma->vm_start < start)
> > +		prev = vma;
> >
> >  	ret = 0;
> >  	for_each_vma_range(vmi, vma, end) {
> > @@ -1625,6 +1627,9 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
> >
> >  	vma_iter_set(&vmi, start);
> >  	prev = vma_prev(&vmi);
> > +	if (vma->vm_start < start)
> > +		prev = vma;
> > +
> >  	ret = 0;
> >  	for_each_vma_range(vmi, vma, end) {
> >  		cond_resched();
> > --
> > 2.39.1
> >
> 
> Other than that looks good:-
> 
> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>

Thanks to both on the quick reviews!
Lorenzo Stoakes May 17, 2023, 6:40 p.m. UTC | #4
On Wed, May 17, 2023 at 02:37:41PM -0400, Peter Xu wrote:
> On Wed, May 17, 2023 at 06:20:55PM +0100, Lorenzo Stoakes wrote:
> > On Wed, May 17, 2023 at 11:04:07AM -0400, Peter Xu wrote:
> > > It seems vma merging with uffd paths is broken with either
> > > register/unregister, where right now we can feed wrong parameters to
> > > vma_merge() and it's found by recent patch which moved asserts upwards in
> > > vma_merge() by Lorenzo Stoakes:
> > >
> > > https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> > >
> > > The problem is in the current code base we didn't fixup "prev" for the case
> > > where "start" address can be within the "prev" vma section.  In that case
> > > we should have "prev" points to the current vma rather than the previous
> > > one when feeding to vma_merge().
> >
> > This doesn't seem quite correct, perhaps - "where start is contained within vma
> > but not clamped to its start. We need to convert this into case 4 which permits
> > subdivision of prev by assigning vma to prev. As we loop, each subsequent VMA
> > will be clamped to the start."
>
> I think it covers more than case 4 - it can also be case 0 where no merge
> will happen?

Ugh please let's not call a case that doesn't merge by a number :P but sure of
course it might also not merge.

>
> >
> > >
> > > This patch will eliminate the report and make sure vma_merge() calls will
> > > become legal again.
> > >
> > > One thing to mention is that the "Fixes: 29417d292bd0" below is there only
> > > to help explain where the warning can start to trigger, the real commit to
> > > fix should be 69dbe6daf104.  Commit 29417d292bd0 helps us to identify the
> > > issue, but unfortunately we may want to keep it in Fixes too just to ease
> > > kernel backporters for easier tracking.
> > >
> > > Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> > > Cc: Mike Rapoport (IBM) <rppt@kernel.org>
> > > Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
> > > Reported-by: Mark Rutland <mark.rutland@arm.com>
> > > Fixes: 29417d292bd0 ("mm/mmap/vma_merge: always check invariants")
> > > Fixes: 69dbe6daf104 ("userfaultfd: use maple tree iterator to iterate VMAs")
> > > Closes: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> > > Cc: linux-stable <stable@vger.kernel.org>
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  fs/userfaultfd.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > > index 0fd96d6e39ce..17c8c345dac4 100644
> > > --- a/fs/userfaultfd.c
> > > +++ b/fs/userfaultfd.c
> > > @@ -1459,6 +1459,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
> > >
> > >  	vma_iter_set(&vmi, start);
> > >  	prev = vma_prev(&vmi);
> > > +	if (vma->vm_start < start)
> > > +		prev = vma;
> > >
> > >  	ret = 0;
> > >  	for_each_vma_range(vmi, vma, end) {
> > > @@ -1625,6 +1627,9 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
> > >
> > >  	vma_iter_set(&vmi, start);
> > >  	prev = vma_prev(&vmi);
> > > +	if (vma->vm_start < start)
> > > +		prev = vma;
> > > +
> > >  	ret = 0;
> > >  	for_each_vma_range(vmi, vma, end) {
> > >  		cond_resched();
> > > --
> > > 2.39.1
> > >
> >
> > Other than that looks good:-
> >
> > Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
>
> Thanks to both on the quick reviews!

No problem!

>
> --
> Peter Xu
>
Peter Xu May 17, 2023, 6:54 p.m. UTC | #5
On Wed, May 17, 2023 at 07:40:59PM +0100, Lorenzo Stoakes wrote:
> On Wed, May 17, 2023 at 02:37:41PM -0400, Peter Xu wrote:
> > On Wed, May 17, 2023 at 06:20:55PM +0100, Lorenzo Stoakes wrote:
> > > On Wed, May 17, 2023 at 11:04:07AM -0400, Peter Xu wrote:
> > > > It seems vma merging with uffd paths is broken with either
> > > > register/unregister, where right now we can feed wrong parameters to
> > > > vma_merge() and it's found by recent patch which moved asserts upwards in
> > > > vma_merge() by Lorenzo Stoakes:
> > > >
> > > > https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> > > >
> > > > The problem is in the current code base we didn't fixup "prev" for the case
> > > > where "start" address can be within the "prev" vma section.  In that case
> > > > we should have "prev" points to the current vma rather than the previous
> > > > one when feeding to vma_merge().
> > >
> > > This doesn't seem quite correct, perhaps - "where start is contained within vma
> > > but not clamped to its start. We need to convert this into case 4 which permits
> > > subdivision of prev by assigning vma to prev. As we loop, each subsequent VMA
> > > will be clamped to the start."
> >
> > I think it covers more than case 4 - it can also be case 0 where no merge
> > will happen?
> 
> Ugh please let's not call a case that doesn't merge by a number :P but sure of
> course it might also not merge.

To me the original paragraph was still fine. But if you prefer your version
(which I'm perfectly fine either way if you'd like to spell out what cases
it'll trigger), it'll be:

  It's possible that "start" is contained within vma but not clamped to its
  start.  We need to convert this into either "cannot merge" case or "can
  merge" case 4 which permits subdivision of prev by assigning vma to
  prev. As we loop, each subsequent VMA will be clamped to the start.

Does that look good to you?

Thanks,
Lorenzo Stoakes May 17, 2023, 7:03 p.m. UTC | #6
On Wed, May 17, 2023 at 02:54:39PM -0400, Peter Xu wrote:
> On Wed, May 17, 2023 at 07:40:59PM +0100, Lorenzo Stoakes wrote:
> > On Wed, May 17, 2023 at 02:37:41PM -0400, Peter Xu wrote:
> > > On Wed, May 17, 2023 at 06:20:55PM +0100, Lorenzo Stoakes wrote:
> > > > On Wed, May 17, 2023 at 11:04:07AM -0400, Peter Xu wrote:
> > > > > It seems vma merging with uffd paths is broken with either
> > > > > register/unregister, where right now we can feed wrong parameters to
> > > > > vma_merge() and it's found by recent patch which moved asserts upwards in
> > > > > vma_merge() by Lorenzo Stoakes:
> > > > >
> > > > > https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
> > > > >
> > > > > The problem is in the current code base we didn't fixup "prev" for the case
> > > > > where "start" address can be within the "prev" vma section.  In that case
> > > > > we should have "prev" points to the current vma rather than the previous
> > > > > one when feeding to vma_merge().
> > > >
> > > > This doesn't seem quite correct, perhaps - "where start is contained within vma
> > > > but not clamped to its start. We need to convert this into case 4 which permits
> > > > subdivision of prev by assigning vma to prev. As we loop, each subsequent VMA
> > > > will be clamped to the start."
> > >
> > > I think it covers more than case 4 - it can also be case 0 where no merge
> > > will happen?
> >
> > Ugh please let's not call a case that doesn't merge by a number :P but sure of
> > course it might also not merge.
>
> To me the original paragraph was still fine. But if you prefer your version
> (which I'm perfectly fine either way if you'd like to spell out what cases
> it'll trigger), it'll be:
>
>   It's possible that "start" is contained within vma but not clamped to its
>   start.  We need to convert this into either "cannot merge" case or "can
>   merge" case 4 which permits subdivision of prev by assigning vma to
>   prev. As we loop, each subsequent VMA will be clamped to the start.
>
> Does that look good to you?
>

Looks good to me, thanks for taking the time!

> Thanks,
>
> --
> Peter Xu
>
diff mbox series

Patch

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 0fd96d6e39ce..17c8c345dac4 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1459,6 +1459,8 @@  static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 
 	vma_iter_set(&vmi, start);
 	prev = vma_prev(&vmi);
+	if (vma->vm_start < start)
+		prev = vma;
 
 	ret = 0;
 	for_each_vma_range(vmi, vma, end) {
@@ -1625,6 +1627,9 @@  static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 
 	vma_iter_set(&vmi, start);
 	prev = vma_prev(&vmi);
+	if (vma->vm_start < start)
+		prev = vma;
+
 	ret = 0;
 	for_each_vma_range(vmi, vma, end) {
 		cond_resched();