[26/41] kernel/fork: assert no VMA readers during its destruction

Message ID: 20230109205336.3665937-27-surenb@google.com
State: New
Series: Per-VMA locks

Commit Message

Suren Baghdasaryan Jan. 9, 2023, 8:53 p.m. UTC
Assert there are no holders of VMA lock for reading when it is about to be
destroyed.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 include/linux/mm.h | 8 ++++++++
 kernel/fork.c      | 2 ++
 2 files changed, 10 insertions(+)

Comments

Michal Hocko Jan. 17, 2023, 3:42 p.m. UTC | #1
On Mon 09-01-23 12:53:21, Suren Baghdasaryan wrote:
> Assert there are no holders of VMA lock for reading when it is about to be
> destroyed.
> 
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  include/linux/mm.h | 8 ++++++++
>  kernel/fork.c      | 2 ++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 594e835bad9c..c464fc8a514c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -680,6 +680,13 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
>  	VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma);
>  }
>  
> +static inline void vma_assert_no_reader(struct vm_area_struct *vma)
> +{
> +	VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock) &&
> +		      vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq),
> +		      vma);

Do we really need to check for vm_lock_seq? rwsem_is_locked should tell
us something is wrong on its own, no? This could be somebody racing with
the vma destruction and using the write lock. Unlikely but I do not see
why to narrow debugging scope.
Suren Baghdasaryan Jan. 18, 2023, 1:53 a.m. UTC | #2
On Tue, Jan 17, 2023 at 7:42 AM 'Michal Hocko' via kernel-team
<kernel-team@android.com> wrote:
>
> On Mon 09-01-23 12:53:21, Suren Baghdasaryan wrote:
> > Assert there are no holders of VMA lock for reading when it is about to be
> > destroyed.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > ---
> >  include/linux/mm.h | 8 ++++++++
> >  kernel/fork.c      | 2 ++
> >  2 files changed, 10 insertions(+)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 594e835bad9c..c464fc8a514c 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -680,6 +680,13 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
> >       VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma);
> >  }
> >
> > +static inline void vma_assert_no_reader(struct vm_area_struct *vma)
> > +{
> > +     VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock) &&
> > +                   vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq),
> > +                   vma);
>
> Do we really need to check for vm_lock_seq? rwsem_is_locked should tell
> us something is wrong on its own, no? This could be somebody racing with
> the vma destruction and using the write lock. Unlikely but I do not see
> why to narrow debugging scope.

I wanted to ensure there are no page fault handlers (read-lockers)
when we are destroying the VMA and rwsem_is_locked(&vma->lock) alone
could trigger if someone is concurrently calling vma_write_lock(). But
I don't think we expect someone to be write-locking the VMA while we
are destroying it, so you are right, I'm overcomplicating things here.
I think I can get rid of vma_assert_no_reader() and add
VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock)) directly in
__vm_area_free(). WDYT?


> --
> Michal Hocko
> SUSE Labs
Michal Hocko Jan. 18, 2023, 9:43 a.m. UTC | #3
On Tue 17-01-23 17:53:00, Suren Baghdasaryan wrote:
> On Tue, Jan 17, 2023 at 7:42 AM 'Michal Hocko' via kernel-team
> <kernel-team@android.com> wrote:
> >
> > On Mon 09-01-23 12:53:21, Suren Baghdasaryan wrote:
> > > Assert there are no holders of VMA lock for reading when it is about to be
> > > destroyed.
> > >
> > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > ---
> > >  include/linux/mm.h | 8 ++++++++
> > >  kernel/fork.c      | 2 ++
> > >  2 files changed, 10 insertions(+)
> > >
> > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > index 594e835bad9c..c464fc8a514c 100644
> > > --- a/include/linux/mm.h
> > > +++ b/include/linux/mm.h
> > > @@ -680,6 +680,13 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
> > >       VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma);
> > >  }
> > >
> > > +static inline void vma_assert_no_reader(struct vm_area_struct *vma)
> > > +{
> > > +     VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock) &&
> > > +                   vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq),
> > > +                   vma);
> >
> > Do we really need to check for vm_lock_seq? rwsem_is_locked should tell
> > us something is wrong on its own, no? This could be somebody racing with
> > the vma destruction and using the write lock. Unlikely but I do not see
> > why to narrow debugging scope.
> 
> I wanted to ensure there are no page fault handlers (read-lockers)
> when we are destroying the VMA and rwsem_is_locked(&vma->lock) alone
> could trigger if someone is concurrently calling vma_write_lock(). But
> I don't think we expect someone to be write-locking the VMA while we

That would be UAF, no?

> are destroying it, so you are right, I'm overcomplicating things here.
> I think I can get rid of vma_assert_no_reader() and add
> VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock)) directly in
> __vm_area_free(). WDYT?

Yes, that adds some debugging. Not sure it is really necessary but it
is VM_BUG_ON so why not.
Suren Baghdasaryan Jan. 18, 2023, 6:06 p.m. UTC | #4
On Wed, Jan 18, 2023 at 1:43 AM 'Michal Hocko' via kernel-team
<kernel-team@android.com> wrote:
>
> On Tue 17-01-23 17:53:00, Suren Baghdasaryan wrote:
> > On Tue, Jan 17, 2023 at 7:42 AM 'Michal Hocko' via kernel-team
> > <kernel-team@android.com> wrote:
> > >
> > > On Mon 09-01-23 12:53:21, Suren Baghdasaryan wrote:
> > > > Assert there are no holders of VMA lock for reading when it is about to be
> > > > destroyed.
> > > >
> > > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > > ---
> > > >  include/linux/mm.h | 8 ++++++++
> > > >  kernel/fork.c      | 2 ++
> > > >  2 files changed, 10 insertions(+)
> > > >
> > > > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > > > index 594e835bad9c..c464fc8a514c 100644
> > > > --- a/include/linux/mm.h
> > > > +++ b/include/linux/mm.h
> > > > @@ -680,6 +680,13 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
> > > >       VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma);
> > > >  }
> > > >
> > > > +static inline void vma_assert_no_reader(struct vm_area_struct *vma)
> > > > +{
> > > > +     VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock) &&
> > > > +                   vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq),
> > > > +                   vma);
> > >
> > > Do we really need to check for vm_lock_seq? rwsem_is_locked should tell
> > > us something is wrong on its own, no? This could be somebody racing with
> > > the vma destruction and using the write lock. Unlikely but I do not see
> > > why to narrow debugging scope.
> >
> > I wanted to ensure there are no page fault handlers (read-lockers)
> > when we are destroying the VMA and rwsem_is_locked(&vma->lock) alone
> > could trigger if someone is concurrently calling vma_write_lock(). But
> > I don't think we expect someone to be write-locking the VMA while we
>
> That would be UAF, no?

Yes. That's why what I have is overkill (and it is also racy).

>
> > are destroying it, so you are right, I'm overcomplicating things here.
> > I think I can get rid of vma_assert_no_reader() and add
> > VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock)) directly in
> > __vm_area_free(). WDYT?
>
> Yes, that adds some debugging. Not sure it is really necessary but it
> is VM_BUG_ON so why not.

I would like to keep it if possible. If it triggers that would be a
clear signal what the issue is. Otherwise it might be hard to debug
such a corner case.

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 594e835bad9c..c464fc8a514c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -680,6 +680,13 @@  static inline void vma_assert_write_locked(struct vm_area_struct *vma)
 	VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma);
 }
 
+static inline void vma_assert_no_reader(struct vm_area_struct *vma)
+{
+	VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock) &&
+		      vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq),
+		      vma);
+}
+
 #else /* CONFIG_PER_VMA_LOCK */
 
 static inline void vma_init_lock(struct vm_area_struct *vma) {}
@@ -688,6 +695,7 @@  static inline bool vma_read_trylock(struct vm_area_struct *vma)
 		{ return false; }
 static inline void vma_read_unlock(struct vm_area_struct *vma) {}
 static inline void vma_assert_write_locked(struct vm_area_struct *vma) {}
+static inline void vma_assert_no_reader(struct vm_area_struct *vma) {}
 
 #endif /* CONFIG_PER_VMA_LOCK */
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 1591dd8a0745..6d9f14e55ecf 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -485,6 +485,8 @@  static void __vm_area_free(struct rcu_head *head)
 {
 	struct vm_area_struct *vma = container_of(head, struct vm_area_struct,
 						  vm_rcu);
+	/* The vma should either have no lock holders or be write-locked. */
+	vma_assert_no_reader(vma);
 	kmem_cache_free(vm_area_cachep, vma);
 }
 #endif