From patchwork Thu Nov 25 06:45:00 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gang Li
X-Patchwork-Id: 12638481
From: Gang Li
To: Hugh Dickins, Andrew Morton, "Kirill A. Shutemov"
Cc: Gang Li, stable@vger.kernel.org, Muchun Song, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v5] shmem: fix a race between shmem_unused_huge_shrink and
 shmem_evict_inode
Date: Thu, 25 Nov 2021 14:45:00 +0800
Message-Id: <20211125064502.99983-1-ligang.bdlg@bytedance.com>
X-Mailer: git-send-email 2.32.0

This patch fixes a data race in commit 779750d20b93 ("shmem: split huge pages
beyond i_size under memory pressure").

Here are the call traces that cause the race:

   Call Trace 1:
     shmem_unused_huge_shrink+0x3ae/0x410
     ? __list_lru_walk_one.isra.5+0x33/0x160
     super_cache_scan+0x17c/0x190
     shrink_slab.part.55+0x1ef/0x3f0
     shrink_node+0x10e/0x330
     kswapd+0x380/0x740
     kthread+0xfc/0x130
     ? mem_cgroup_shrink_node+0x170/0x170
     ? kthread_create_on_node+0x70/0x70
     ret_from_fork+0x1f/0x30

   Call Trace 2:
     shmem_evict_inode+0xd8/0x190
     evict+0xbe/0x1c0
     do_unlinkat+0x137/0x330
     do_syscall_64+0x76/0x120
     entry_SYSCALL_64_after_hwframe+0x3d/0xa2

A simple explanation:

Imagine there are 3 items in the local list (@list).
In the first traversal, A is not deleted from @list.

  1)    A->B->C
        ^
        |
        pos (leave)

In the second traversal, B is deleted from @list. Concurrently, A is
deleted from @list through shmem_evict_inode() since the last reference
to the inode is dropped by another thread. Then @list is corrupted,
because the shrinker modifies its local list without holding
shrinklist_lock.

  2)    A->B->C
        ^  ^
        |  |
     evict pos (drop)

We should make sure the inode is either on the global list or deleted from
any local list before iput().

Fixed by moving inodes back to global list before we put them.

Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Gang Li
Reviewed-by: Muchun Song
Acked-by: Kirill A. Shutemov
---
Changes in v5:
- Fix a compile warning

Changes in v4:
- Rework the comments

Changes in v3:
- Add more comment.
- Use list_move(&info->shrinklist, &sbinfo->shrinklist) instead of
  list_move(pos, &sbinfo->shrinklist) for consistency.

Changes in v2: https://lore.kernel.org/all/20211124030840.88455-1-ligang.bdlg@bytedance.com/
- Move spinlock to the front of iput instead of changing lock type
  since iput will call evict which may cause deadlock by requesting
  shrinklist_lock.
- Add call trace in commit message.

v1: https://lore.kernel.org/lkml/20211122064126.76734-1-ligang.bdlg@bytedance.com/
---
 mm/shmem.c | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 9023103ee7d8..a6487fe0583f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -554,7 +554,7 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 	struct shmem_inode_info *info;
 	struct page *page;
 	unsigned long batch = sc ? sc->nr_to_scan : 128;
-	int removed = 0, split = 0;
+	int split = 0;
 
 	if (list_empty(&sbinfo->shrinklist))
 		return SHRINK_STOP;
@@ -569,7 +569,6 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		/* inode is about to be evicted */
 		if (!inode) {
 			list_del_init(&info->shrinklist);
-			removed++;
 			goto next;
 		}
 
@@ -577,12 +576,12 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		if (round_up(inode->i_size, PAGE_SIZE) ==
 				round_up(inode->i_size, HPAGE_PMD_SIZE)) {
 			list_move(&info->shrinklist, &to_remove);
-			removed++;
 			goto next;
 		}
 
 		list_move(&info->shrinklist, &list);
 next:
+		sbinfo->shrinklist_len--;
 		if (!--batch)
 			break;
 	}
@@ -602,7 +601,7 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		inode = &info->vfs_inode;
 
 		if (nr_to_split && split >= nr_to_split)
-			goto leave;
+			goto move_back;
 
 		page = find_get_page(inode->i_mapping,
 				(inode->i_size & HPAGE_PMD_MASK) >> PAGE_SHIFT);
@@ -616,38 +615,43 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		}
 
 		/*
-		 * Leave the inode on the list if we failed to lock
-		 * the page at this time.
+		 * Move the inode on the list back to shrinklist if we failed
+		 * to lock the page at this time.
 		 *
 		 * Waiting for the lock may lead to deadlock in the
 		 * reclaim path.
 		 */
 		if (!trylock_page(page)) {
 			put_page(page);
-			goto leave;
+			goto move_back;
 		}
 
 		ret = split_huge_page(page);
 		unlock_page(page);
 		put_page(page);
 
-		/* If split failed leave the inode on the list */
+		/* If split failed move the inode on the list back to shrinklist */
 		if (ret)
-			goto leave;
+			goto move_back;
 
 		split++;
 drop:
 		list_del_init(&info->shrinklist);
-		removed++;
-leave:
+		goto put;
+move_back:
+		/*
+		 * Make sure the inode is either on the global list or deleted from
+		 * any local list before iput() since it could be deleted in another
+		 * thread once we put the inode (then the local list is corrupted).
+		 */
+		spin_lock(&sbinfo->shrinklist_lock);
+		list_move(&info->shrinklist, &sbinfo->shrinklist);
+		sbinfo->shrinklist_len++;
+		spin_unlock(&sbinfo->shrinklist_lock);
+put:
 		iput(inode);
 	}
 
-	spin_lock(&sbinfo->shrinklist_lock);
-	list_splice_tail(&list, &sbinfo->shrinklist);
-	sbinfo->shrinklist_len -= removed;
-	spin_unlock(&sbinfo->shrinklist_lock);
-
 	return split;
 }
 
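
The invariant the patch enforces (an inode is either on the global list or on
no list at all by the time iput() runs) can be looked at in isolation with a
small userspace model. This is only a sketch under stated assumptions, not
kernel code: struct list_node, struct toy_sb, struct toy_inode and every toy_*
function are hypothetical stand-ins, the model is single-threaded, and the
places where the kernel would hold shrinklist_lock are only marked in comments.
It assumes, as the description above says, that the final put evicts the inode
and removes it from whatever list it is still on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal circular list, loosely modeled on struct list_head. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static int list_is_empty(const struct list_node *h) { return h->next == h; }

static void list_unlink(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical stand-ins for the superblock info and the inode. */
struct toy_sb { struct list_node shrinklist; long shrinklist_len; };
struct toy_inode { struct list_node shrinklist; int refcount; };

/* Final put evicts: unlink from the list the inode is on (lock omitted). */
static void toy_iput(struct toy_sb *sb, struct toy_inode *inode)
{
	if (--inode->refcount > 0)
		return;
	if (!list_is_empty(&inode->shrinklist)) {
		list_unlink(&inode->shrinklist);
		sb->shrinklist_len--;
	}
}

/* "move_back" path: return to the global list first, then put. */
static void toy_move_back_and_put(struct toy_sb *sb, struct toy_inode *inode)
{
	/* the kernel takes shrinklist_lock around these three lines */
	list_unlink(&inode->shrinklist);
	list_add_tail_node(&inode->shrinklist, &sb->shrinklist);
	sb->shrinklist_len++;
	toy_iput(sb, inode);
}

/* "drop" path: leave every list first, then put. */
static void toy_drop_and_put(struct toy_sb *sb, struct toy_inode *inode)
{
	list_unlink(&inode->shrinklist);
	toy_iput(sb, inode);
}

/* Replays the A, B, C scenario; returns the final global list length. */
static long toy_demo(void)
{
	struct toy_sb sb;
	struct toy_inode ino[3];	/* A, B, C */
	struct list_node local;
	int i;

	list_init(&sb.shrinklist);
	sb.shrinklist_len = 0;
	list_init(&local);

	for (i = 0; i < 3; i++) {
		ino[i].refcount = 1;	/* reference held by the file's user */
		list_init(&ino[i].shrinklist);
		list_add_tail_node(&ino[i].shrinklist, &sb.shrinklist);
		sb.shrinklist_len++;
	}

	/* Isolation pass: grab a reference and move to the local list. */
	for (i = 0; i < 3; i++) {
		ino[i].refcount++;
		list_unlink(&ino[i].shrinklist);
		list_add_tail_node(&ino[i].shrinklist, &local);
		sb.shrinklist_len--;
	}

	/* Someone else drops the other reference to A in the meantime. */
	ino[0].refcount--;

	toy_move_back_and_put(&sb, &ino[0]);	/* final put evicts A from the global list */
	toy_drop_and_put(&sb, &ino[1]);		/* B leaves every list */
	toy_move_back_and_put(&sb, &ino[2]);	/* C stays queued globally */

	assert(list_is_empty(&local));		/* eviction never touched it */
	assert(sb.shrinklist.next == &ino[2].shrinklist);
	return sb.shrinklist_len;
}
```

Calling toy_demo() from a test harness replays the scenario: A's eviction
happens only after A is back on the global list, so the local list is never
modified from the eviction side, and the length counter stays consistent with
the single inode (C) left on the global list.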