Message ID | 1544340077-11491-1-git-send-email-gchen.guomin@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | Fix mm->owner point to a task that does not exists |
On Sun, Dec 09, 2018 at 03:21:17PM +0800, gchen.guomin@gmail.com wrote:
> From: guominchen <guominchen@tencent.com>
>
> Under normal circumstances, when do_exit exits, mm->owner will
> be updated, but when the kernel process calls unuse_mm and exits,
> mm->owner cannot be updated, and will point to a task that has
> been released.
>
> Below is my issue on vhost_net:
> A, B are two kernel processes (such as vhost_worker),
> C is a user space process (such as qemu), and all
> three use the mm of the user process C.
> Now, because user process C exits abnormally, the owner of this
> mm becomes A. When A calls unuse_mm and exits, this mm->owner
> still points to the A that has been released.
> When B accesses this mm->owner again, A has been released.
>
> Process A                Process B
> vhost_worker()           vhost_worker()
> ---------                ---------
> use_mm()                 use_mm()
>  ...
> unuse_mm()
>    tsk->mm=NULL
> do_exit()                page fault
>  exit_mm()               access mm->owner
>  can't update owner      kernel Oops
>
>                          unuse_mm()
>
> Cc: <linux-mm@kvack.org>
> Cc: <linux-kernel@vger.kernel.org>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: <netdev@vger.kernel.org>
> Signed-off-by: guominchen <guominchen@tencent.com>
> ---
>  mm/mmu_context.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/mm/mmu_context.c b/mm/mmu_context.c
> index 3e612ae..185bb23 100644
> --- a/mm/mmu_context.c
> +++ b/mm/mmu_context.c
> @@ -56,7 +56,6 @@ void unuse_mm(struct mm_struct *mm)
>
>  	task_lock(tsk);
>  	sync_mm_rss(mm);
> -	tsk->mm = NULL;
>  	/* active_mm is still 'mm' */
>  	enter_lazy_tlb(mm, tsk);
>  	task_unlock(tsk);

So that will work for vhost because we never drop the mm reference
before destroying the task.
I wonder whether that's true for other users though.

It would seem cleaner to invoke some callback so tasks such as vhost
can drop the reference.

And looking at all this code, I don't understand why mm->owner is safe
to change like this:

	mm->owner = NULL;

when users seem to use it under RCU.

> --
> 1.8.3.1
>> From: guominchen <guominchen@tencent.com>
>>
>> Under normal circumstances, when do_exit exits, mm->owner will
>> be updated, but when the kernel process calls unuse_mm and exits,
>> mm->owner cannot be updated, and will point to a task that has
>> been released.
>>
>> Below is my issue on vhost_net:
>> A, B are two kernel processes (such as vhost_worker),
>> C is a user space process (such as qemu), and all
>> three use the mm of the user process C.
>> Now, because user process C exits abnormally, the owner of this
>> mm becomes A. When A calls unuse_mm and exits, this mm->owner
>> still points to the A that has been released.
>> When B accesses this mm->owner again, A has been released.
>>
>> Process A                Process B
>> vhost_worker()           vhost_worker()
>> ---------                ---------
>> use_mm()                 use_mm()
>>  ...
>> unuse_mm()
>>    tsk->mm=NULL
>> do_exit()                page fault
>>  exit_mm()               access mm->owner
>>  can't update owner      kernel Oops
>>
>>                          unuse_mm()
>>
>> Cc: <linux-mm@kvack.org>
>> Cc: <linux-kernel@vger.kernel.org>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Jason Wang <jasowang@redhat.com>
>> Cc: <netdev@vger.kernel.org>
>> Signed-off-by: guominchen <guominchen@tencent.com>
>> ---
>>  mm/mmu_context.c | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/mm/mmu_context.c b/mm/mmu_context.c
>> index 3e612ae..185bb23 100644
>> --- a/mm/mmu_context.c
>> +++ b/mm/mmu_context.c
>> @@ -56,7 +56,6 @@ void unuse_mm(struct mm_struct *mm)
>>
>>  	task_lock(tsk);
>>  	sync_mm_rss(mm);
>> -	tsk->mm = NULL;
>>  	/* active_mm is still 'mm' */
>>  	enter_lazy_tlb(mm, tsk);
>>  	task_unlock(tsk);

>So that will work for vhost because we never drop the mm reference
>before destroying the task.
>I wonder whether that's true for other users though.

>It would seem cleaner to invoke some callback so tasks such as vhost
>can drop the reference.

Yes, I can remove this call in vhost, but I think use_mm() and
unuse_mm() are called in pairs in order to share the mm.
And exit_mm(), as the unified mm handler, already does this job very
well, so we should leave the mm to exit_mm() to handle.

>And looking at all this code, I don't understand why mm->owner is safe
>to change like this:
>	mm->owner = NULL;
>when users seem to use it under RCU.

I think that mm->owner = NULL just changes the value of the pointer;
the task_struct it points to is still present and has not been released.

>> --
>> 1.8.3.1
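The exchange above turns on what a reader of mm->owner sees when the pointer is cleared. The following is a minimal userspace sketch of only the reader-side discipline (load the pointer once, check it for NULL, then dereference the loaded copy). It uses C11 atomics as a stand-in and does not model kernel RCU, grace periods, or task_struct lifetime; `task_model`, `mm_model`, `owner_pid_or_minus1()` and `clear_owner()` are hypothetical names invented for this illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and mm_struct. */
struct task_model {
	int pid;
};

struct mm_model {
	_Atomic(struct task_model *) owner;
};

/*
 * Reader side: load the owner pointer exactly once, then work only on
 * the loaded copy. Returns the owner's pid, or -1 if there is no owner.
 */
static int owner_pid_or_minus1(struct mm_model *mm)
{
	struct task_model *t =
		atomic_load_explicit(&mm->owner, memory_order_acquire);

	if (t == NULL)
		return -1;	/* owner was cleared; nothing to dereference */
	return t->pid;
}

/* Writer side: clearing the owner only changes the pointer value. */
static void clear_owner(struct mm_model *mm)
{
	atomic_store_explicit(&mm->owner, (struct task_model *)NULL,
			      memory_order_release);
}
```

In this model a reader that loaded the pointer before `clear_owner()` ran still holds the old value, so clearing the pointer alone says nothing about whether the object behind it is still alive. That lifetime question is the part this sketch leaves out.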
diff --git a/mm/mmu_context.c b/mm/mmu_context.c
index 3e612ae..185bb23 100644
--- a/mm/mmu_context.c
+++ b/mm/mmu_context.c
@@ -56,7 +56,6 @@ void unuse_mm(struct mm_struct *mm)
 
 	task_lock(tsk);
 	sync_mm_rss(mm);
-	tsk->mm = NULL;
 	/* active_mm is still 'mm' */
 	enter_lazy_tlb(mm, tsk);
 	task_unlock(tsk);
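To make the scenario from the commit message concrete, here is a small userspace model of the A/B sequence. It is a sketch under stated assumptions, not kernel code: all `model_*` names and the `released` flag are hypothetical, and the exit path is assumed to hand over ownership only when the exiting task still has its mm pointer set, as the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_model;

struct mm_model {
	struct task_model *owner;
};

struct task_model {
	struct mm_model *mm;
	bool released;		/* set once the task has exited */
};

static void model_use_mm(struct task_model *t, struct mm_model *mm)
{
	assert(t != NULL && mm != NULL);
	t->mm = mm;
}

/* clear_mm == true models the current code; false models the patch. */
static void model_unuse_mm(struct task_model *t, bool clear_mm)
{
	if (clear_mm)
		t->mm = NULL;
}

/* Assumed exit path: ownership is handed over only if t->mm is set. */
static void model_exit(struct task_model *t, struct task_model *next)
{
	struct mm_model *mm = t->mm;

	if (mm != NULL) {
		if (mm->owner == t)
			mm->owner = next;
		t->mm = NULL;
	}
	t->released = true;
}

/* Returns true if mm->owner ends up pointing at a released task. */
static bool model_scenario(bool clear_mm)
{
	struct mm_model mm = { NULL };
	struct task_model a = { NULL, false };
	struct task_model b = { NULL, false };

	model_use_mm(&a, &mm);
	model_use_mm(&b, &mm);
	mm.owner = &a;		/* C exited abnormally; A became owner */

	model_unuse_mm(&a, clear_mm);
	model_exit(&a, &b);

	return mm.owner != NULL && mm.owner->released;
}
```

Calling `model_scenario(true)` follows the unpatched sequence and leaves the owner pointing at the exited task A, while `model_scenario(false)` follows the patched sequence and hands ownership to B. The model says nothing about other use_mm()/unuse_mm() callers, which is the concern raised in the review.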