Message ID | 20220420043734.476348-1-apopple@nvidia.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] mm/mmu_notifier.c: Fix race in mmu_interval_notifier_remove() | expand |
On Wed, 20 Apr 2022 14:37:34 +1000 Alistair Popple <apopple@nvidia.com> wrote: > In some cases it is possible for mmu_interval_notifier_remove() to race > with mn_tree_inv_end() allowing it to return while the notifier data > structure is still in use. Consider the following sequence: > > CPU0 - mn_tree_inv_end() CPU1 - mmu_interval_notifier_remove() > ----------------------------------- ------------------------------------ > spin_lock(subscriptions->lock); > seq = subscriptions->invalidate_seq; > spin_lock(subscriptions->lock); spin_unlock(subscriptions->lock); > subscriptions->invalidate_seq++; > wait_event(invalidate_seq != seq); > return; > interval_tree_remove(interval_sub); kfree(interval_sub); > spin_unlock(subscriptions->lock); > wake_up_all(); > > As the wait_event() condition is true it will return immediately. This > can lead to use-after-free type errors if the caller frees the data > structure containing the interval notifier subscription while it is > still on a deferred list. Fix this by taking the appropriate lock when > reading invalidate_seq to ensure proper synchronisation. > > ... > > Fixes: 99cb252f5e68 ("mm/mmu_notifier: add an interval tree notifier") Do you think fix this should be backported into older kernels?
Andrew Morton <akpm@linux-foundation.org> writes: > On Wed, 20 Apr 2022 14:37:34 +1000 Alistair Popple <apopple@nvidia.com> wrote: > >> In some cases it is possible for mmu_interval_notifier_remove() to race >> with mn_tree_inv_end() allowing it to return while the notifier data >> structure is still in use. Consider the following sequence: >> >> CPU0 - mn_tree_inv_end() CPU1 - mmu_interval_notifier_remove() >> ----------------------------------- ------------------------------------ >> spin_lock(subscriptions->lock); >> seq = subscriptions->invalidate_seq; >> spin_lock(subscriptions->lock); spin_unlock(subscriptions->lock); >> subscriptions->invalidate_seq++; >> wait_event(invalidate_seq != seq); >> return; >> interval_tree_remove(interval_sub); kfree(interval_sub); >> spin_unlock(subscriptions->lock); >> wake_up_all(); >> >> As the wait_event() condition is true it will return immediately. This >> can lead to use-after-free type errors if the caller frees the data >> structure containing the interval notifier subscription while it is >> still on a deferred list. Fix this by taking the appropriate lock when >> reading invalidate_seq to ensure proper synchronisation. >> >> ... >> >> Fixes: 99cb252f5e68 ("mm/mmu_notifier: add an interval tree notifier") > > Do you think fix this should be backported into older kernels? Yes, I forgot to cc stable sorry. Do you want me to resend with 'Cc: stable@vger.kernel.org'? - Alistair
On Thu, 21 Apr 2022 09:21:06 +1000 Alistair Popple <apopple@nvidia.com> wrote: > >> As the wait_event() condition is true it will return immediately. This > >> can lead to use-after-free type errors if the caller frees the data > >> structure containing the interval notifier subscription while it is > >> still on a deferred list. Fix this by taking the appropriate lock when > >> reading invalidate_seq to ensure proper synchronisation. > >> > >> ... > >> > >> Fixes: 99cb252f5e68 ("mm/mmu_notifier: add an interval tree notifier") > > > > Do you think fix this should be backported into older kernels? > > Yes, I forgot to cc stable sorry. So we have actually seen these use-after-free errors? Some description of the end-user visible impact is always helpful when deciding which trees need a patch. > Do you want me to resend with > 'Cc: stable@vger.kernel.org'? Thanks, I added that.
Andrew Morton <akpm@linux-foundation.org> writes: > On Thu, 21 Apr 2022 09:21:06 +1000 Alistair Popple <apopple@nvidia.com> wrote: > >> >> As the wait_event() condition is true it will return immediately. This >> >> can lead to use-after-free type errors if the caller frees the data >> >> structure containing the interval notifier subscription while it is >> >> still on a deferred list. Fix this by taking the appropriate lock when >> >> reading invalidate_seq to ensure proper synchronisation. >> >> >> >> ... >> >> >> >> Fixes: 99cb252f5e68 ("mm/mmu_notifier: add an interval tree notifier") >> > >> > Do you think fix this should be backported into older kernels? >> >> Yes, I forgot to cc stable sorry. > > So we have actually seen these use-after-free errors? I observed them whilst running stress testing during some development. You do have to be pretty unlucky, but it lead to the usual problems of use-after-free (memory corruption, kernel crash, difficult to diagnose WARN_ON, etc) so I think it's worth backporting. > Some description of the end-user visible impact is always helpful when > deciding which trees need a patch. > >> Do you want me to resend with >> 'Cc: stable@vger.kernel.org'? > > Thanks, I added that. Thanks.
On Wed, Apr 20, 2022 at 03:11:42PM -0700, Andrew Morton wrote: > On Wed, 20 Apr 2022 14:37:34 +1000 Alistair Popple <apopple@nvidia.com> wrote: > > > In some cases it is possible for mmu_interval_notifier_remove() to race > > with mn_tree_inv_end() allowing it to return while the notifier data > > structure is still in use. Consider the following sequence: > > > > CPU0 - mn_tree_inv_end() CPU1 - mmu_interval_notifier_remove() > > spin_lock(subscriptions->lock); > > seq = subscriptions->invalidate_seq; > > spin_lock(subscriptions->lock); spin_unlock(subscriptions->lock); > > subscriptions->invalidate_seq++; > > wait_event(invalidate_seq != seq); > > return; > > interval_tree_remove(interval_sub); kfree(interval_sub); > > spin_unlock(subscriptions->lock); > > wake_up_all(); > > > > As the wait_event() condition is true it will return immediately. This > > can lead to use-after-free type errors if the caller frees the data > > structure containing the interval notifier subscription while it is > > still on a deferred list. Fix this by taking the appropriate lock when > > reading invalidate_seq to ensure proper synchronisation. > > > > ... > > > > Fixes: 99cb252f5e68 ("mm/mmu_notifier: add an interval tree notifier") > > Do you think fix this should be backported into older kernels? I think it should be tagged stable, yes Jason
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c index 3f3bbcd298c6..e0275b9f6b81 100644 --- a/mm/mmu_notifier.c +++ b/mm/mmu_notifier.c @@ -1036,6 +1036,18 @@ int mmu_interval_notifier_insert_locked( } EXPORT_SYMBOL_GPL(mmu_interval_notifier_insert_locked); +static bool +mmu_interval_seq_released(struct mmu_notifier_subscriptions *subscriptions, + unsigned long seq) +{ + bool ret; + + spin_lock(&subscriptions->lock); + ret = subscriptions->invalidate_seq != seq; + spin_unlock(&subscriptions->lock); + return ret; +} + /** * mmu_interval_notifier_remove - Remove a interval notifier * @interval_sub: Interval subscription to unregister @@ -1086,7 +1098,7 @@ void mmu_interval_notifier_remove(struct mmu_interval_notifier *interval_sub) lock_map_release(&__mmu_notifier_invalidate_range_start_map); if (seq) wait_event(subscriptions->wq, - READ_ONCE(subscriptions->invalidate_seq) != seq); + mmu_interval_seq_released(subscriptions, seq)); /* pairs with mmgrab in mmu_interval_notifier_insert() */ mmdrop(mm);