Message ID | 20250305200403.2822855-1-ziy@nvidia.com (mailing list archive)
---|---
State | New
Series | [v3] mm/migrate: fix shmem xarray update during migration
On Wed, Mar 05, 2025 at 03:04:03PM -0500, Zi Yan wrote:
> A shmem folio can be either in page cache or in swap cache, but not at the
> same time. Namely, once it is in swap cache, folio->mapping should be NULL,
> and the folio is no longer in a shmem mapping.
>
> In __folio_migrate_mapping(), to determine the number of xarray entries
> to update, folio_test_swapbacked() is used, but that conflates the shmem in
> page cache case and the shmem in swap cache case. It leads to xarray
> multi-index entry corruption, since it turns a sibling entry into a
> normal entry during xas_store() (see [1] for a userspace reproduction).
> Fix it by only using folio_test_swapcache() to determine whether the xarray
> is storing swap cache entries or not to choose the right number of xarray
> entries to update.
>
> [1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/
>
> Note:
> In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is used
> to get the swap_cache address space, but that ignores the shmem folio in swap
> cache case. It could lead to NULL pointer dereferencing when an
> in-swap-cache shmem folio is split at __xa_store(), since
> !folio_test_anon() is true and folio->mapping is NULL. But fortunately,
> its caller split_huge_page_to_list_to_order() bails out early with EBUSY
> when folio->mapping is NULL. So no need to take care of it here.
>
> Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
> Reported-by: Liu Shixin <liushixin2@huawei.com>
> Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
On 2025/3/6 04:04, Zi Yan wrote:
> A shmem folio can be either in page cache or in swap cache, but not at the
> same time. Namely, once it is in swap cache, folio->mapping should be NULL,
> and the folio is no longer in a shmem mapping.
>
> In __folio_migrate_mapping(), to determine the number of xarray entries
> to update, folio_test_swapbacked() is used, but that conflates the shmem in
> page cache case and the shmem in swap cache case. It leads to xarray
> multi-index entry corruption, since it turns a sibling entry into a
> normal entry during xas_store() (see [1] for a userspace reproduction).
> Fix it by only using folio_test_swapcache() to determine whether the xarray
> is storing swap cache entries or not to choose the right number of xarray
> entries to update.
>
> [1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/
>
> Note:
> In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is used
> to get the swap_cache address space, but that ignores the shmem folio in swap
> cache case. It could lead to NULL pointer dereferencing when an
> in-swap-cache shmem folio is split at __xa_store(), since
> !folio_test_anon() is true and folio->mapping is NULL. But fortunately,
> its caller split_huge_page_to_list_to_order() bails out early with EBUSY
> when folio->mapping is NULL. So no need to take care of it here.
>
> Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
> Reported-by: Liu Shixin <liushixin2@huawei.com>
> Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Cc: stable@vger.kernel.org

Thanks for fixing the issue.

Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

> ---
> mm/migrate.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index fb4afd31baf0..c0adea67cd62 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -518,15 +518,13 @@ static int __folio_migrate_mapping(struct address_space *mapping,
>  	if (folio_test_anon(folio) && folio_test_large(folio))
>  		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
>  	folio_ref_add(newfolio, nr); /* add cache reference */
> -	if (folio_test_swapbacked(folio)) {
> +	if (folio_test_swapbacked(folio))
>  		__folio_set_swapbacked(newfolio);
> -		if (folio_test_swapcache(folio)) {
> -			folio_set_swapcache(newfolio);
> -			newfolio->private = folio_get_private(folio);
> -		}
> +	if (folio_test_swapcache(folio)) {
> +		folio_set_swapcache(newfolio);
> +		newfolio->private = folio_get_private(folio);
>  		entries = nr;
>  	} else {
> -		VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
>  		entries = 1;
>  	}
>
On 2025/3/6 4:04, Zi Yan wrote:
> A shmem folio can be either in page cache or in swap cache, but not at the
> same time. Namely, once it is in swap cache, folio->mapping should be NULL,
> and the folio is no longer in a shmem mapping.
>
> In __folio_migrate_mapping(), to determine the number of xarray entries
> to update, folio_test_swapbacked() is used, but that conflates the shmem in
> page cache case and the shmem in swap cache case. It leads to xarray
> multi-index entry corruption, since it turns a sibling entry into a
> normal entry during xas_store() (see [1] for a userspace reproduction).
> Fix it by only using folio_test_swapcache() to determine whether the xarray
> is storing swap cache entries or not to choose the right number of xarray
> entries to update.
>
> [1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/
>
> Note:
> In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is used
> to get the swap_cache address space, but that ignores the shmem folio in swap
> cache case. It could lead to NULL pointer dereferencing when an
> in-swap-cache shmem folio is split at __xa_store(), since
> !folio_test_anon() is true and folio->mapping is NULL. But fortunately,
> its caller split_huge_page_to_list_to_order() bails out early with EBUSY
> when folio->mapping is NULL. So no need to take care of it here.
>
> Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
> Reported-by: Liu Shixin <liushixin2@huawei.com>
> Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Cc: stable@vger.kernel.org

Thanks for the patch, it works for me.

> ---
> mm/migrate.c | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index fb4afd31baf0..c0adea67cd62 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -518,15 +518,13 @@ static int __folio_migrate_mapping(struct address_space *mapping,
>  	if (folio_test_anon(folio) && folio_test_large(folio))
>  		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
>  	folio_ref_add(newfolio, nr); /* add cache reference */
> -	if (folio_test_swapbacked(folio)) {
> +	if (folio_test_swapbacked(folio))
>  		__folio_set_swapbacked(newfolio);
> -		if (folio_test_swapcache(folio)) {
> -			folio_set_swapcache(newfolio);
> -			newfolio->private = folio_get_private(folio);
> -		}
> +	if (folio_test_swapcache(folio)) {
> +		folio_set_swapcache(newfolio);
> +		newfolio->private = folio_get_private(folio);
>  		entries = nr;
>  	} else {
> -		VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
>  		entries = 1;
>  	}
>
diff --git a/mm/migrate.c b/mm/migrate.c
index fb4afd31baf0..c0adea67cd62 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -518,15 +518,13 @@ static int __folio_migrate_mapping(struct address_space *mapping,
 	if (folio_test_anon(folio) && folio_test_large(folio))
 		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
 	folio_ref_add(newfolio, nr); /* add cache reference */
-	if (folio_test_swapbacked(folio)) {
+	if (folio_test_swapbacked(folio))
 		__folio_set_swapbacked(newfolio);
-		if (folio_test_swapcache(folio)) {
-			folio_set_swapcache(newfolio);
-			newfolio->private = folio_get_private(folio);
-		}
+	if (folio_test_swapcache(folio)) {
+		folio_set_swapcache(newfolio);
+		newfolio->private = folio_get_private(folio);
 		entries = nr;
 	} else {
-		VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
 		entries = 1;
 	}
A shmem folio can be either in page cache or in swap cache, but not at the
same time. Namely, once it is in swap cache, folio->mapping should be NULL,
and the folio is no longer in a shmem mapping.

In __folio_migrate_mapping(), to determine the number of xarray entries
to update, folio_test_swapbacked() is used, but that conflates the shmem in
page cache case and the shmem in swap cache case. It leads to xarray
multi-index entry corruption, since it turns a sibling entry into a
normal entry during xas_store() (see [1] for a userspace reproduction).
Fix it by only using folio_test_swapcache() to determine whether the xarray
is storing swap cache entries or not to choose the right number of xarray
entries to update.

[1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/

Note:
In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is used
to get the swap_cache address space, but that ignores the shmem folio in swap
cache case. It could lead to NULL pointer dereferencing when an
in-swap-cache shmem folio is split at __xa_store(), since
!folio_test_anon() is true and folio->mapping is NULL. But fortunately,
its caller split_huge_page_to_list_to_order() bails out early with EBUSY
when folio->mapping is NULL. So no need to take care of it here.

Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
Reported-by: Liu Shixin <liushixin2@huawei.com>
Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: stable@vger.kernel.org
---
mm/migrate.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
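
To make the new control flow easy to follow outside of diff context, here is the post-patch branch of __folio_migrate_mapping(), reassembled from the hunk above with explanatory comments added (a simplified excerpt; the surrounding xarray locking, reference counting, and store logic are omitted):

	if (folio_test_swapbacked(folio))
		__folio_set_swapbacked(newfolio);
	if (folio_test_swapcache(folio)) {
		folio_set_swapcache(newfolio);
		newfolio->private = folio_get_private(folio);
		/* swap cache: a large folio occupies nr separate xarray slots */
		entries = nr;
	} else {
		/* page cache (shmem included): one multi-index entry covers the folio */
		entries = 1;
	}

The number of xarray entries to update is now keyed off folio_test_swapcache() alone, so a shmem folio that is still in the page cache keeps its single multi-index entry intact during migration instead of having its sibling entries overwritten.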