[v3] mm/migrate: fix shmem xarray update during migration

Message ID 20250305200403.2822855-1-ziy@nvidia.com (mailing list archive)
State New

Commit Message

Zi Yan March 5, 2025, 8:04 p.m. UTC
A shmem folio can be either in page cache or in swap cache, but not at the
same time. Namely, once it is in swap cache, folio->mapping should be NULL,
and the folio is no longer in a shmem mapping.

In __folio_migrate_mapping(), folio_test_swapbacked() is used to determine
the number of xarray entries to update, but that conflates the
shmem-in-page-cache case with the shmem-in-swap-cache case. This leads to
xarray multi-index entry corruption, since xas_store() turns a sibling
entry into a normal entry (see [1] for a userspace reproduction).
Fix it by using only folio_test_swapcache() to determine whether the
xarray is storing swap cache entries, and thus to choose the right number
of xarray entries to update.

[1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/

Note:
In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is used
to get the swap_cache address space, but that ignores the case of a shmem
folio in swap cache. It could lead to a NULL pointer dereference at
__xa_store() when an in-swap-cache shmem folio is split, since
!folio_test_anon() is true and folio->mapping is NULL. Fortunately, its
caller split_huge_page_to_list_to_order() bails out early with EBUSY when
folio->mapping is NULL, so there is no need to handle it here.

Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
Reported-by: Liu Shixin <liushixin2@huawei.com>
Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: stable@vger.kernel.org
---
 mm/migrate.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)
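
The Note in the commit message relies on an early bail-out in the caller. A minimal userspace sketch of that condition follows, assuming only what the Note states (a NULL folio->mapping yields EBUSY before the split proceeds). The struct split_model type and model_split_precheck() are hypothetical names for illustration, not kernel code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the one folio field the Note talks about. */
struct split_model {
	const void *mapping;	/* models folio->mapping */
};

/*
 * Model of the early bail-out described in the Note: a folio whose
 * mapping is NULL (for example a shmem folio that sits in swap cache)
 * is rejected with -EBUSY, so the later __xa_store() on a NULL mapping
 * is never reached.
 */
static int model_split_precheck(const struct split_model *f)
{
	if (f->mapping == NULL)
		return -EBUSY;
	return 0;
}
```

In this model a folio with mapping == NULL never gets past the precheck, which is why the Note concludes that __split_huge_page() needs no change.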

Comments

Matthew Wilcox March 5, 2025, 8:31 p.m. UTC | #1
On Wed, Mar 05, 2025 at 03:04:03PM -0500, Zi Yan wrote:
> A shmem folio can be either in page cache or in swap cache, but not at the
> same time. Namely, once it is in swap cache, folio->mapping should be NULL,
> and the folio is no longer in a shmem mapping.
> 
> In __folio_migrate_mapping(), to determine the number of xarray entries
> to update, folio_test_swapbacked() is used, but that conflates shmem in
> page cache case and shmem in swap cache case. It leads to xarray
> multi-index entry corruption, since it turns a sibling entry to a
> normal entry during xas_store() (see [1] for a userspace reproduction).
> Fix it by only using folio_test_swapcache() to determine whether xarray
> is storing swap cache entries or not to choose the right number of xarray
> entries to update.
> 
> [1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/
> 
> Note:
> In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is used
> to get swap_cache address space, but that ignores the shmem folio in swap
> cache case. It could lead to NULL pointer dereferencing when a
> in-swap-cache shmem folio is split at __xa_store(), since
> !folio_test_anon() is true and folio->mapping is NULL. But fortunately,
> its caller split_huge_page_to_list_to_order() bails out early with EBUSY
> when folio->mapping is NULL. So no need to take care of it here.
> 
> Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
> Reported-by: Liu Shixin <liushixin2@huawei.com>
> Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Baolin Wang March 8, 2025, 3:03 a.m. UTC | #2
On 2025/3/6 04:04, Zi Yan wrote:
> A shmem folio can be either in page cache or in swap cache, but not at the
> same time. Namely, once it is in swap cache, folio->mapping should be NULL,
> and the folio is no longer in a shmem mapping.
> 
> In __folio_migrate_mapping(), to determine the number of xarray entries
> to update, folio_test_swapbacked() is used, but that conflates shmem in
> page cache case and shmem in swap cache case. It leads to xarray
> multi-index entry corruption, since it turns a sibling entry to a
> normal entry during xas_store() (see [1] for a userspace reproduction).
> Fix it by only using folio_test_swapcache() to determine whether xarray
> is storing swap cache entries or not to choose the right number of xarray
> entries to update.
> 
> [1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/
> 
> Note:
> In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is used
> to get swap_cache address space, but that ignores the shmem folio in swap
> cache case. It could lead to NULL pointer dereferencing when a
> in-swap-cache shmem folio is split at __xa_store(), since
> !folio_test_anon() is true and folio->mapping is NULL. But fortunately,
> its caller split_huge_page_to_list_to_order() bails out early with EBUSY
> when folio->mapping is NULL. So no need to take care of it here.
> 
> Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
> Reported-by: Liu Shixin <liushixin2@huawei.com>
> Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Cc: stable@vger.kernel.org

Thanks for fixing the issue.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

> ---
>   mm/migrate.c | 10 ++++------
>   1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index fb4afd31baf0..c0adea67cd62 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -518,15 +518,13 @@ static int __folio_migrate_mapping(struct address_space *mapping,
>   	if (folio_test_anon(folio) && folio_test_large(folio))
>   		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
>   	folio_ref_add(newfolio, nr); /* add cache reference */
> -	if (folio_test_swapbacked(folio)) {
> +	if (folio_test_swapbacked(folio))
>   		__folio_set_swapbacked(newfolio);
> -		if (folio_test_swapcache(folio)) {
> -			folio_set_swapcache(newfolio);
> -			newfolio->private = folio_get_private(folio);
> -		}
> +	if (folio_test_swapcache(folio)) {
> +		folio_set_swapcache(newfolio);
> +		newfolio->private = folio_get_private(folio);
>   		entries = nr;
>   	} else {
> -		VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
>   		entries = 1;
>   	}
>
Liu Shixin March 8, 2025, 3:17 a.m. UTC | #3
On 2025/3/6 4:04, Zi Yan wrote:
> A shmem folio can be either in page cache or in swap cache, but not at the
> same time. Namely, once it is in swap cache, folio->mapping should be NULL,
> and the folio is no longer in a shmem mapping.
>
> In __folio_migrate_mapping(), to determine the number of xarray entries
> to update, folio_test_swapbacked() is used, but that conflates shmem in
> page cache case and shmem in swap cache case. It leads to xarray
> multi-index entry corruption, since it turns a sibling entry to a
> normal entry during xas_store() (see [1] for a userspace reproduction).
> Fix it by only using folio_test_swapcache() to determine whether xarray
> is storing swap cache entries or not to choose the right number of xarray
> entries to update.
>
> [1] https://lore.kernel.org/linux-mm/Z8idPCkaJW1IChjT@casper.infradead.org/
>
> Note:
> In __split_huge_page(), folio_test_anon() && folio_test_swapcache() is used
> to get swap_cache address space, but that ignores the shmem folio in swap
> cache case. It could lead to NULL pointer dereferencing when a
> in-swap-cache shmem folio is split at __xa_store(), since
> !folio_test_anon() is true and folio->mapping is NULL. But fortunately,
> its caller split_huge_page_to_list_to_order() bails out early with EBUSY
> when folio->mapping is NULL. So no need to take care of it here.
>
> Fixes: fc346d0a70a1 ("mm: migrate high-order folios in swap cache correctly")
> Reported-by: Liu Shixin <liushixin2@huawei.com>
> Closes: https://lore.kernel.org/all/28546fb4-5210-bf75-16d6-43e1f8646080@huawei.com/
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Cc: stable@vger.kernel.org
Thanks for the patch, it works for me.
> ---
>  mm/migrate.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index fb4afd31baf0..c0adea67cd62 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -518,15 +518,13 @@ static int __folio_migrate_mapping(struct address_space *mapping,
>  	if (folio_test_anon(folio) && folio_test_large(folio))
>  		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
>  	folio_ref_add(newfolio, nr); /* add cache reference */
> -	if (folio_test_swapbacked(folio)) {
> +	if (folio_test_swapbacked(folio))
>  		__folio_set_swapbacked(newfolio);
> -		if (folio_test_swapcache(folio)) {
> -			folio_set_swapcache(newfolio);
> -			newfolio->private = folio_get_private(folio);
> -		}
> +	if (folio_test_swapcache(folio)) {
> +		folio_set_swapcache(newfolio);
> +		newfolio->private = folio_get_private(folio);
>  		entries = nr;
>  	} else {
> -		VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
>  		entries = 1;
>  	}
>

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index fb4afd31baf0..c0adea67cd62 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -518,15 +518,13 @@  static int __folio_migrate_mapping(struct address_space *mapping,
 	if (folio_test_anon(folio) && folio_test_large(folio))
 		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
 	folio_ref_add(newfolio, nr); /* add cache reference */
-	if (folio_test_swapbacked(folio)) {
+	if (folio_test_swapbacked(folio))
 		__folio_set_swapbacked(newfolio);
-		if (folio_test_swapcache(folio)) {
-			folio_set_swapcache(newfolio);
-			newfolio->private = folio_get_private(folio);
-		}
+	if (folio_test_swapcache(folio)) {
+		folio_set_swapcache(newfolio);
+		newfolio->private = folio_get_private(folio);
 		entries = nr;
 	} else {
-		VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
 		entries = 1;
 	}
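
The branch change in the hunk above can be modeled in userspace to see which folios are affected. This is a sketch under stated assumptions, not kernel code: struct folio_model and the entries_before()/entries_after() helpers are hypothetical names, and the model tracks only the two flags and the folio order that the hunk depends on.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the folio state the hunk inspects. */
struct folio_model {
	bool swapbacked;	/* models folio_test_swapbacked() */
	bool swapcache;		/* models folio_test_swapcache() */
	unsigned int order;	/* models folio_order() */
};

/* Number of pages covered by the folio (nr in the hunk): 1 << order. */
static long model_nr_pages(const struct folio_model *f)
{
	return 1L << f->order;
}

/* Pre-patch choice: every swap-backed folio updates nr xarray entries. */
static long entries_before(const struct folio_model *f)
{
	return f->swapbacked ? model_nr_pages(f) : 1;
}

/* Post-patch choice: only a folio in swap cache updates nr entries. */
static long entries_after(const struct folio_model *f)
{
	return f->swapcache ? model_nr_pages(f) : 1;
}
```

For an order-4 shmem folio in page cache (swapbacked set, swapcache clear), entries_before() yields 16 while entries_after() yields 1. That is the case the commit message describes, where the old branch stored over sibling entries of a multi-index entry. For a folio in swap cache, and for a folio that is not swap-backed at all, the two helpers agree.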