From patchwork Wed Jan 24 09:14:11 2024
From: Yu Kuai
Subject: [PATCH v2 01/11] md: don't ignore suspended array in md_check_recovery()
Date: Wed, 24 Jan 2024 17:14:11 +0800
Message-ID: <20240124091421.1261579-2-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>

mddev_suspend() never stops the sync_thread, hence it doesn't make sense to
ignore a suspended array in md_check_recovery(), which might cause the
sync_thread to never be unregistered.
After commit f52f5c71f3d4 ("md: fix stopping sync thread"), the following
hang can be triggered by the test shell/integrity-caching.sh:

1) suspend the array:
raid_postsuspend
 mddev_suspend

2) stop the array:
raid_dtr
 md_stop
  __md_stop_writes
   stop_sync_thread
    set_bit(MD_RECOVERY_INTR, &mddev->recovery);
    md_wakeup_thread_directly(mddev->sync_thread);
    wait_event(..., !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))

3) sync thread done:
md_do_sync
 set_bit(MD_RECOVERY_DONE, &mddev->recovery);
 md_wakeup_thread(mddev->thread);

4) daemon thread can't unregister sync thread:
md_check_recovery
 if (mddev->suspended)
  return; -> return directly
 md_reap_sync_thread
  clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
 -> MD_RECOVERY_RUNNING can't be cleared, hence step 2 hangs;

This problem is not just related to dm-raid; fix it by not ignoring a
suspended array in md_check_recovery(). Follow-up patches will improve
dm-raid so that the sync_thread is properly frozen during suspend.

Reported-by: Mikulas Patocka
Closes: https://lore.kernel.org/all/8fb335e-6d2c-dbb5-d7-ded8db5145a@redhat.com/
Fixes: 68866e425be2 ("MD: no sync IO while suspended")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2266358d8074..07b80278eaa5 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9469,9 +9469,6 @@ static void md_start_sync(struct work_struct *ws)
  */
 void md_check_recovery(struct mddev *mddev)
 {
-	if (READ_ONCE(mddev->suspended))
-		return;
-
 	if (mddev->bitmap)
 		md_bitmap_daemon_work(mddev);

From patchwork Wed Jan 24 09:14:12 2024
From: Yu Kuai
Subject: [PATCH v2 02/11] md: don't ignore read-only array in md_check_recovery()
Date: Wed, 24 Jan 2024 17:14:12 +0800
Message-ID: <20240124091421.1261579-3-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>

Usually, if the array is not read-write, md_check_recovery() won't register
a new sync_thread in the first place. And if the array is read-write and a
sync_thread is registered, md_set_readonly() will unregister the
sync_thread before setting the array read-only. md/raid follows this
behavior, hence there is no problem.

After commit f52f5c71f3d4 ("md: fix stopping sync thread"), the following
hang can be triggered by the test shell/integrity-caching.sh:

1) the array is read-only, dm-raid updates the super block:
rs_update_sbs
 ro = mddev->ro
 mddev->ro = 0 -> set array read-write
 md_update_sb

2) a new sync thread is registered concurrently.

3) dm-raid sets the array back to read-only:
rs_update_sbs
 mddev->ro = ro

4) stop the array:
raid_dtr
 md_stop
  stop_sync_thread
   set_bit(MD_RECOVERY_INTR, &mddev->recovery);
   md_wakeup_thread_directly(mddev->sync_thread);
   wait_event(..., !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))

5) sync thread done:
md_do_sync
 set_bit(MD_RECOVERY_DONE, &mddev->recovery);
 md_wakeup_thread(mddev->thread);

6) daemon thread can't unregister sync thread:
md_check_recovery
 if (!md_is_rdwr(mddev) &&
     !test_bit(MD_RECOVERY_NEEDED, &mddev->recovery))
  return;
 -> MD_RECOVERY_RUNNING can't be cleared, hence step 4 hangs;

The root cause is that dm-raid manipulates 'mddev->ro' by itself; dm-raid
really should stop the sync thread before setting the array read-only.
Unfortunately, more code needs to be studied before the handling of
'mddev->ro' in dm-raid can be refactored, hence fix the problem the easy
way for now to prevent a dm-raid regression.

Reported-by: Mikulas Patocka
Closes: https://lore.kernel.org/all/9801e40-8ac7-e225-6a71-309dcf9dc9aa@redhat.com/
Fixes: ecbfb9f118bc ("dm raid: add raid level takeover support")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 07b80278eaa5..6906d023f1d6 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9445,6 +9445,20 @@ static void md_start_sync(struct work_struct *ws)
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
 }
 
+static void unregister_sync_thread(struct mddev *mddev)
+{
+	if (!test_bit(MD_RECOVERY_DONE, &mddev->recovery)) {
+		/* resync/recovery still happening */
+		clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+		return;
+	}
+
+	if (WARN_ON_ONCE(!mddev->sync_thread))
+		return;
+
+	md_reap_sync_thread(mddev);
+}
+
 /*
  * This routine is regularly called by all per-raid-array threads to
  * deal with generic issues like resync and super-block update.
@@ -9482,7 +9496,8 @@ void md_check_recovery(struct mddev *mddev)
 	}
 
 	if (!md_is_rdwr(mddev) &&
-	    !test_bit(MD_RECOVERY_NEEDED, &mddev->recovery))
+	    !test_bit(MD_RECOVERY_NEEDED, &mddev->recovery) &&
+	    !test_bit(MD_RECOVERY_DONE, &mddev->recovery))
 		return;
 	if ( ! (
 		(mddev->sb_flags & ~ (1<<MD_SB_CHANGE_PENDING)) ||
[...]
 		if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
-			/* sync_work already queued. */
-			clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+			unregister_sync_thread(mddev);
 			goto unlock;
 		}
@@ -9568,16 +9582,7 @@ void md_check_recovery(struct mddev *mddev)
 	 * still set.
 	 */
 	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
-		if (!test_bit(MD_RECOVERY_DONE, &mddev->recovery)) {
-			/* resync/recovery still happening */
-			clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-			goto unlock;
-		}
-
-		if (WARN_ON_ONCE(!mddev->sync_thread))
-			goto unlock;
-
-		md_reap_sync_thread(mddev);
+		unregister_sync_thread(mddev);
 		goto unlock;
 	}

From patchwork Wed Jan 24 09:14:13 2024
From: Yu Kuai
Subject: [PATCH v2 03/11] md: make sure md_do_sync() will set MD_RECOVERY_DONE
Date: Wed, 24 Jan 2024 17:14:13 +0800
Message-ID: <20240124091421.1261579-4-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>
stop_sync_thread() will interrupt md_do_sync(), and md_do_sync() must set
MD_RECOVERY_DONE, so that the following md_check_recovery() will unregister
the sync_thread, clear MD_RECOVERY_RUNNING and wake up stop_sync_thread().

If MD_RECOVERY_WAIT is set or the array is read-only, md_do_sync() will
return without setting MD_RECOVERY_DONE; and after commit f52f5c71f3d4
("md: fix stopping sync thread"), dm-raid switched from
md_reap_sync_thread() to stop_sync_thread() to unregister the sync_thread
from md_stop() and md_stop_writes(), causing the test
shell/lvconvert-raid-reshape.sh to hang.

We shouldn't switch back to md_reap_sync_thread() because it's problematic
in the first place. Fix the problem by making sure md_do_sync() will set
MD_RECOVERY_DONE.

Reported-by: Mikulas Patocka
Closes: https://lore.kernel.org/all/ece2b06f-d647-6613-a534-ff4c9bec1142@redhat.com/
Fixes: d5d885fd514f ("md: introduce new personality funciton start()")
Fixes: 5fd6c1dce06e ("[PATCH] md: allow checkpoint of recovery with version-1 superblock")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 6906d023f1d6..c65dfd156090 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8788,12 +8788,16 @@ void md_do_sync(struct md_thread *thread)
 	int ret;
 
 	/* just incase thread restarts... */
-	if (test_bit(MD_RECOVERY_DONE, &mddev->recovery) ||
-	    test_bit(MD_RECOVERY_WAIT, &mddev->recovery))
+	if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
 		return;
-	if (!md_is_rdwr(mddev)) {/* never try to sync a read-only array */
+
+	if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+		goto skip;
+
+	if (test_bit(MD_RECOVERY_WAIT, &mddev->recovery) ||
+	    !md_is_rdwr(mddev)) {/* never try to sync a read-only array */
 		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-		return;
+		goto skip;
 	}
 
 	if (mddev_is_clustered(mddev)) {

From patchwork Wed Jan 24 09:14:14 2024
From: Yu Kuai
Subject: [PATCH v2 04/11] md: don't register sync_thread for reshape directly
Date: Wed, 24 Jan 2024 17:14:14 +0800
Message-ID: <20240124091421.1261579-5-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>

Currently, if reshape is interrupted, reassembling the array will register
the sync_thread directly from pers->run(). In this case
'MD_RECOVERY_RUNNING' is set directly, however, there is no guarantee that
md_do_sync() will be executed, hence stop_sync_thread() will hang because
'MD_RECOVERY_RUNNING' can't be cleared.

The last patch makes sure that md_do_sync() will set MD_RECOVERY_DONE,
however, the following hang can still be triggered occasionally by the
dm-raid test shell/lvconvert-raid-reshape.sh:

[root@fedora ~]# cat /proc/1982/stack
[<0>] stop_sync_thread+0x1ab/0x270 [md_mod]
[<0>] md_frozen_sync_thread+0x5c/0xa0 [md_mod]
[<0>] raid_presuspend+0x1e/0x70 [dm_raid]
[<0>] dm_table_presuspend_targets+0x40/0xb0 [dm_mod]
[<0>] __dm_destroy+0x2a5/0x310 [dm_mod]
[<0>] dm_destroy+0x16/0x30 [dm_mod]
[<0>] dev_remove+0x165/0x290 [dm_mod]
[<0>] ctl_ioctl+0x4bb/0x7b0 [dm_mod]
[<0>] dm_ctl_ioctl+0x11/0x20 [dm_mod]
[<0>] vfs_ioctl+0x21/0x60
[<0>] __x64_sys_ioctl+0xb9/0xe0
[<0>] do_syscall_64+0xc6/0x230
[<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74

Meanwhile mddev->recovery is:
MD_RECOVERY_RUNNING | MD_RECOVERY_INTR | MD_RECOVERY_RESHAPE |
MD_RECOVERY_FROZEN

Fix this problem by removing the code that registers the sync_thread
directly from raid10 and raid5, and let md_check_recovery() register the
sync_thread instead.

Fixes: f67055780caa ("[PATCH] md: Checkpoint and allow restart of raid5 reshape")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Signed-off-by: Yu Kuai
---
 drivers/md/md.c     |  5 ++++-
 drivers/md/raid10.c | 16 ++--------------
 drivers/md/raid5.c  | 29 ++---------------------------
 3 files changed, 8 insertions(+), 42 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index c65dfd156090..6c5d0a372927 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9372,6 +9372,7 @@ static void md_start_sync(struct work_struct *ws)
 	struct mddev *mddev = container_of(ws, struct mddev, sync_work);
 	int spares = 0;
 	bool suspend = false;
+	char *name;
 
 	if (md_spares_need_change(mddev))
 		suspend = true;
@@ -9404,8 +9405,10 @@ static void md_start_sync(struct work_struct *ws)
 	if (spares)
 		md_bitmap_write_all(mddev->bitmap);
 
+	name = test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) ?
+ "reshape" : "resync"; rcu_assign_pointer(mddev->sync_thread, - md_register_thread(md_do_sync, mddev, "resync")); + md_register_thread(md_do_sync, mddev, name)); if (!mddev->sync_thread) { pr_warn("%s: could not start resync thread...\n", mdname(mddev)); diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 7412066ea22c..a5f8419e2df1 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -4175,11 +4175,7 @@ static int raid10_run(struct mddev *mddev) clear_bit(MD_RECOVERY_SYNC, &mddev->recovery); clear_bit(MD_RECOVERY_CHECK, &mddev->recovery); set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); - set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); - rcu_assign_pointer(mddev->sync_thread, - md_register_thread(md_do_sync, mddev, "reshape")); - if (!mddev->sync_thread) - goto out_free_conf; + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); } return 0; @@ -4573,16 +4569,8 @@ static int raid10_start_reshape(struct mddev *mddev) clear_bit(MD_RECOVERY_CHECK, &mddev->recovery); clear_bit(MD_RECOVERY_DONE, &mddev->recovery); set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); - set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); - - rcu_assign_pointer(mddev->sync_thread, - md_register_thread(md_do_sync, mddev, "reshape")); - if (!mddev->sync_thread) { - ret = -EAGAIN; - goto abort; - } + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); conf->reshape_checkpoint = jiffies; - md_wakeup_thread(mddev->sync_thread); md_new_event(); return 0; diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 8497880135ee..6a7a32f7fb91 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -7936,11 +7936,7 @@ static int raid5_run(struct mddev *mddev) clear_bit(MD_RECOVERY_SYNC, &mddev->recovery); clear_bit(MD_RECOVERY_CHECK, &mddev->recovery); set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); - set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); - rcu_assign_pointer(mddev->sync_thread, - md_register_thread(md_do_sync, mddev, "reshape")); - if (!mddev->sync_thread) - goto abort; + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); } /* Ok, everything is just fine now */ @@ -8506,29 +8502,8 @@ static int raid5_start_reshape(struct mddev *mddev) clear_bit(MD_RECOVERY_CHECK, &mddev->recovery); clear_bit(MD_RECOVERY_DONE, &mddev->recovery); set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); - set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); - rcu_assign_pointer(mddev->sync_thread, - md_register_thread(md_do_sync, mddev, "reshape")); - if (!mddev->sync_thread) { - mddev->recovery = 0; - spin_lock_irq(&conf->device_lock); - write_seqcount_begin(&conf->gen_lock); - mddev->raid_disks = conf->raid_disks = conf->previous_raid_disks; - mddev->new_chunk_sectors = - conf->chunk_sectors = conf->prev_chunk_sectors; - mddev->new_layout = conf->algorithm = conf->prev_algo; - rdev_for_each(rdev, mddev) - rdev->new_data_offset = rdev->data_offset; - smp_wmb(); - conf->generation --; - conf->reshape_progress = MaxSector; - mddev->reshape_position = MaxSector; - write_seqcount_end(&conf->gen_lock); - spin_unlock_irq(&conf->device_lock); - return -EAGAIN; - } + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); conf->reshape_checkpoint = jiffies; - md_wakeup_thread(mddev->sync_thread); md_new_event(); return 0; } From patchwork Wed Jan 24 09:14:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13528840 Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [45.249.212.190]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
From: Yu Kuai
Subject: [PATCH v2 05/11] md: export helpers to stop sync_thread
Date: Wed, 24 Jan 2024 17:14:15 +0800
Message-ID: <20240124091421.1261579-6-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>

The new helpers will be used in dm-raid in later patches to fix regressions
and to prevent calling md_reap_sync_thread() directly.
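As an illustration of the intended calling pattern for these helpers,
mirroring how the later dm-raid patches in this series use them (editorial
sketch only, not part of the patch; the example_* names are hypothetical):

    /* Both helpers assert that reconfig_mutex is held, so take it first. */
    static void example_freeze(struct mddev *mddev)
    {
            mddev_lock_nointr(mddev);
            md_frozen_sync_thread(mddev);   /* set MD_RECOVERY_FROZEN, stop sync_thread */
            mddev_unlock(mddev);
    }

    static void example_unfreeze(struct mddev *mddev)
    {
            mddev_lock_nointr(mddev);
            md_unfrozen_sync_thread(mddev); /* clear MD_RECOVERY_FROZEN, restart recovery */
            mddev_unlock(mddev);
    }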
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 41 +++++++++++++++++++++++++++++++++++++----
 drivers/md/md.h |  3 +++
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 6c5d0a372927..90cf31b53804 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4915,30 +4915,63 @@ static void stop_sync_thread(struct mddev *mddev, bool locked, bool check_seq)
 		mddev_lock_nointr(mddev);
 }
 
-static void idle_sync_thread(struct mddev *mddev)
+void md_idle_sync_thread(struct mddev *mddev)
 {
+	lockdep_assert_held(&mddev->reconfig_mutex);
+
 	mutex_lock(&mddev->sync_mutex);
 	clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	stop_sync_thread(mddev, true, true);
+	mutex_unlock(&mddev->sync_mutex);
+}
+EXPORT_SYMBOL_GPL(md_idle_sync_thread);
+
+void md_frozen_sync_thread(struct mddev *mddev)
+{
+	lockdep_assert_held(&mddev->reconfig_mutex);
+
+	mutex_lock(&mddev->sync_mutex);
+	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	stop_sync_thread(mddev, true, false);
+	mutex_unlock(&mddev->sync_mutex);
+}
+EXPORT_SYMBOL_GPL(md_frozen_sync_thread);
+
+void md_unfrozen_sync_thread(struct mddev *mddev)
+{
+	lockdep_assert_held(&mddev->reconfig_mutex);
+
+	mutex_lock(&mddev->sync_mutex);
+	clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	md_wakeup_thread(mddev->thread);
+	sysfs_notify_dirent_safe(mddev->sysfs_action);
+	mutex_unlock(&mddev->sync_mutex);
+}
+EXPORT_SYMBOL_GPL(md_unfrozen_sync_thread);
+
+static void idle_sync_thread(struct mddev *mddev)
+{
 	if (mddev_lock(mddev)) {
 		mutex_unlock(&mddev->sync_mutex);
 		return;
 	}
 
+	mutex_lock(&mddev->sync_mutex);
+	clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	stop_sync_thread(mddev, false, true);
 	mutex_unlock(&mddev->sync_mutex);
 }
 
 static void frozen_sync_thread(struct mddev *mddev)
 {
-	mutex_lock(&mddev->sync_mutex);
-	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-
 	if (mddev_lock(mddev)) {
 		mutex_unlock(&mddev->sync_mutex);
 		return;
 	}
 
+	mutex_lock(&mddev->sync_mutex);
+	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	stop_sync_thread(mddev, false, false);
 	mutex_unlock(&mddev->sync_mutex);
 }
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 8d881cc59799..437ab70ce79b 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -781,6 +781,9 @@ extern void md_rdev_clear(struct md_rdev *rdev);
 extern void md_handle_request(struct mddev *mddev, struct bio *bio);
 extern int mddev_suspend(struct mddev *mddev, bool interruptible);
 extern void mddev_resume(struct mddev *mddev);
+extern void md_idle_sync_thread(struct mddev *mddev);
+extern void md_frozen_sync_thread(struct mddev *mddev);
+extern void md_unfrozen_sync_thread(struct mddev *mddev);
 extern void md_reload_sb(struct mddev *mddev, int raid_disk);
 extern void md_update_sb(struct mddev *mddev, int force);

From patchwork Wed Jan 24 09:14:16 2024
From: Yu Kuai
Subject: [PATCH v2 06/11] dm-raid: really frozen sync_thread during suspend
Date: Wed, 24 Jan 2024 17:14:16 +0800
Message-ID: <20240124091421.1261579-7-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>

1) The flag MD_RECOVERY_FROZEN doesn't mean that the sync thread is frozen;
   it only prevents a new sync_thread from starting, and it can't stop a
   running sync thread.
2) The flag MD_RECOVERY_FROZEN doesn't mean that writes are stopped, so
   using it as the condition for md_stop_writes() in raid_postsuspend()
   doesn't look correct.
3) raid_message() can set/clear the flag MD_RECOVERY_FROZEN at any time,
   and if MD_RECOVERY_FROZEN is cleared while the array is suspended, a new
   sync_thread can start unexpectedly.

Fix the above problems by using the new helpers to suspend the array during
suspend, and also disallow raid_message() to change the sync_thread status
during suspend.

Note that after commit f52f5c71f3d4 ("md: fix stopping sync thread"), the
test shell/lvconvert-raid-reshape.sh started to hang in stop_sync_thread(),
and with the previous fixes the test won't hang there anymore; however, the
test will still fail and complain that ext4 is corrupted. With this patch,
the test won't hang due to stop_sync_thread() or fail due to ext4
corruption anymore. However, there is still a deadlock related to
dm-raid456 that will be fixed in the following patches.
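For orientation, the dm-raid suspend/resume flow that results from this
patch can be summarized as the following call map (an editorial summary of
the diff below, with hooks listed in the order the dm core invokes them):

    raid_presuspend()      -> md_frozen_sync_thread()   /* freeze and stop sync_thread */
    raid_presuspend_undo() -> md_unfrozen_sync_thread() /* suspend aborted, allow restart */
    raid_postsuspend()     -> md_stop_writes(); mddev_suspend()
    raid_resume()          -> mddev->ro = 0; md_unfrozen_sync_thread(); mddev_unlock_and_resume()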
Reported-by: Mikulas Patocka
Closes: https://lore.kernel.org/all/e5e8afe2-e9a8-49a2-5ab0-958d4065c55e@redhat.com/
Fixes: 1af2048a3e87 ("dm raid: fix deadlock caused by premature md_stop_writes()")
Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Signed-off-by: Yu Kuai
---
 drivers/md/dm-raid.c | 38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index eb009d6bb03a..5ce3c6020b1b 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3240,11 +3240,12 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	rs->md.ro = 1;
 	rs->md.in_sync = 1;
 
-	/* Keep array frozen until resume. */
-	set_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);
-
 	/* Has to be held on running the array */
 	mddev_suspend_and_lock_nointr(&rs->md);
+
+	/* Keep array frozen until resume. */
+	md_frozen_sync_thread(&rs->md);
+
 	r = md_run(&rs->md);
 	rs->md.in_sync = 0; /* Assume already marked dirty */
 	if (r) {
@@ -3722,6 +3723,9 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
 	if (!mddev->pers || !mddev->pers->sync_request)
 		return -EINVAL;
 
+	if (test_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags))
+		return -EBUSY;
+
 	if (!strcasecmp(argv[0], "frozen"))
 		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	else
@@ -3791,15 +3795,31 @@ static void raid_io_hints(struct dm_target *ti, struct queue_limits *limits)
 	blk_limits_io_opt(limits, chunk_size_bytes * mddev_data_stripes(rs));
 }
 
+static void raid_presuspend(struct dm_target *ti)
+{
+	struct raid_set *rs = ti->private;
+
+	mddev_lock_nointr(&rs->md);
+	md_frozen_sync_thread(&rs->md);
+	mddev_unlock(&rs->md);
+}
+
+static void raid_presuspend_undo(struct dm_target *ti)
+{
+	struct raid_set *rs = ti->private;
+
+	mddev_lock_nointr(&rs->md);
+	md_unfrozen_sync_thread(&rs->md);
+	mddev_unlock(&rs->md);
+}
+
 static void raid_postsuspend(struct dm_target *ti)
 {
 	struct raid_set *rs = ti->private;
 
 	if (!test_and_set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) {
 		/* Writes have to be stopped before suspending to avoid deadlocks.
 		 */
-		if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
-			md_stop_writes(&rs->md);
-
+		md_stop_writes(&rs->md);
 		mddev_suspend(&rs->md, false);
 	}
 }
@@ -4012,8 +4032,6 @@ static int raid_preresume(struct dm_target *ti)
 	}
 
 	/* Check for any resize/reshape on @rs and adjust/initiate */
-	/* Be prepared for mddev_resume() in raid_resume() */
-	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	if (mddev->recovery_cp && mddev->recovery_cp < MaxSector) {
 		set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
 		mddev->resync_min = mddev->recovery_cp;
@@ -4056,9 +4074,9 @@ static void raid_resume(struct dm_target *ti)
 		rs_set_capacity(rs);
 
 		mddev_lock_nointr(mddev);
-		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		mddev->ro = 0;
 		mddev->in_sync = 0;
+		md_unfrozen_sync_thread(mddev);
 		mddev_unlock_and_resume(mddev);
 	}
 }
@@ -4074,6 +4092,8 @@ static struct target_type raid_target = {
 	.message = raid_message,
 	.iterate_devices = raid_iterate_devices,
 	.io_hints = raid_io_hints,
+	.presuspend = raid_presuspend,
+	.presuspend_undo = raid_presuspend_undo,
 	.postsuspend = raid_postsuspend,
 	.preresume = raid_preresume,
 	.resume = raid_resume,

From patchwork Wed Jan 24 09:14:17 2024
From: Yu Kuai
Subject: [PATCH v2 07/11] md/dm-raid: don't call md_reap_sync_thread() directly
Date: Wed, 24 Jan 2024 17:14:17 +0800
Message-ID: <20240124091421.1261579-8-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>
Currently md_reap_sync_thread() is called from raid_message() directly
without holding 'reconfig_mutex'. This is definitely unsafe because
md_reap_sync_thread() can change many fields that are protected by
'reconfig_mutex'.

However, holding 'reconfig_mutex' there is still problematic because this
will cause a deadlock; see for example commit 130443d60b1b ("md: refactor
idle/frozen_sync_thread() to fix deadlock").

Fix this problem by using stop_sync_thread() to unregister the sync_thread,
like md/raid does.

Fixes: be83651f0050 ("DM RAID: Add message/status support for changing sync action")
Signed-off-by: Yu Kuai
---
 drivers/md/dm-raid.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 5ce3c6020b1b..6b6c011d9f69 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3719,6 +3719,7 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
 {
 	struct raid_set *rs = ti->private;
 	struct mddev *mddev = &rs->md;
+	int ret = 0;
 
 	if (!mddev->pers || !mddev->pers->sync_request)
 		return -EINVAL;
@@ -3726,17 +3727,24 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
 	if (test_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags))
 		return -EBUSY;
 
-	if (!strcasecmp(argv[0], "frozen"))
-		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	else
-		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	if (!strcasecmp(argv[0], "frozen")) {
+		ret = mddev_lock(mddev);
+		if (ret)
+			return ret;
 
-	if (!strcasecmp(argv[0], "idle") || !strcasecmp(argv[0], "frozen")) {
-		if (mddev->sync_thread) {
-			set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-			md_reap_sync_thread(mddev);
-		}
-	} else if (decipher_sync_action(mddev, mddev->recovery) != st_idle)
+		md_frozen_sync_thread(mddev);
+		mddev_unlock(mddev);
+	} else if (!strcasecmp(argv[0], "idle")) {
+		ret = mddev_lock(mddev);
+		if (ret)
+			return ret;
+
+		md_idle_sync_thread(mddev);
+		mddev_unlock(mddev);
+	}
+
+	clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	if (decipher_sync_action(mddev, mddev->recovery) != st_idle)
 		return -EBUSY;
 	else if (!strcasecmp(argv[0], "resync"))
 		;	/* MD_RECOVERY_NEEDED set below */

From patchwork Wed Jan 24 09:14:18 2024
From: Yu Kuai
Subject: [PATCH v2 08/11] dm-raid: remove mddev_suspend/resume()
Date: Wed, 24 Jan 2024 17:14:18 +0800
Message-ID: <20240124091421.1261579-9-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>

dm_suspend() already makes sure that no new IO can be issued and will wait
for all dispatched IO to be done. There is no need to call mddev_suspend()
to ensure that again.

Signed-off-by: Yu Kuai
---
 drivers/md/dm-raid.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 6b6c011d9f69..f1637cf88559 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3241,7 +3241,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	rs->md.in_sync = 1;
 
 	/* Has to be held on running the array */
-	mddev_suspend_and_lock_nointr(&rs->md);
+	mddev_lock_nointr(&rs->md);
 
 	/* Keep array frozen until resume. */
 	md_frozen_sync_thread(&rs->md);
@@ -3825,11 +3825,9 @@ static void raid_postsuspend(struct dm_target *ti)
 {
 	struct raid_set *rs = ti->private;
 
-	if (!test_and_set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) {
+	if (!test_and_set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags))
 		/* Writes have to be stopped before suspending to avoid deadlocks.
 		 */
 		md_stop_writes(&rs->md);
-		mddev_suspend(&rs->md, false);
-	}
 }
 
 static void attempt_restore_of_faulty_devices(struct raid_set *rs)
@@ -4085,7 +4083,7 @@ static void raid_resume(struct dm_target *ti)
 		mddev->ro = 0;
 		mddev->in_sync = 0;
 		md_unfrozen_sync_thread(mddev);
-		mddev_unlock_and_resume(mddev);
+		mddev_unlock(mddev);
 	}
 }

From patchwork Wed Jan 24 09:14:19 2024
From: Yu Kuai
Subject: [PATCH v2 09/11] dm-raid: add a new helper prepare_suspend() in md_personality
Date: Wed, 24 Jan 2024 17:14:19 +0800
Message-ID: <20240124091421.1261579-10-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>

There are no functional changes for now; this prepares for fixing a
deadlock for dm-raid456.
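For reference, the raid456 implementation of this new hook, added by the
last patch in this series, only needs to wake up IO that is waiting for
reshape:

    static void raid5_prepare_suspend(struct mddev *mddev)
    {
            struct r5conf *conf = mddev->private;

            wake_up(&conf->wait_for_overlap);
    }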
Signed-off-by: Yu Kuai
---
 drivers/md/dm-raid.c | 10 +++++++---
 drivers/md/md.h      |  1 +
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index f1637cf88559..ede2e4574ee5 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3806,10 +3806,14 @@ static void raid_io_hints(struct dm_target *ti, struct queue_limits *limits)
 static void raid_presuspend(struct dm_target *ti)
 {
 	struct raid_set *rs = ti->private;
+	struct mddev *mddev = &rs->md;
 
-	mddev_lock_nointr(&rs->md);
-	md_frozen_sync_thread(&rs->md);
-	mddev_unlock(&rs->md);
+	mddev_lock_nointr(mddev);
+	md_frozen_sync_thread(mddev);
+	mddev_unlock(mddev);
+
+	if (mddev->pers && mddev->pers->prepare_suspend)
+		mddev->pers->prepare_suspend(mddev);
 }
 
 static void raid_presuspend_undo(struct dm_target *ti)
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 437ab70ce79b..29b476ff3b9f 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -617,6 +617,7 @@ struct md_personality
 	int (*start_reshape) (struct mddev *mddev);
 	void (*finish_reshape) (struct mddev *mddev);
 	void (*update_reshape_pos) (struct mddev *mddev);
+	void (*prepare_suspend) (struct mddev *mddev);
 	/* quiesce suspends or resumes internal processing.
 	 * 1 - stop new actions and wait for action io to complete
 	 * 0 - return to normal behaviour

From patchwork Wed Jan 24 09:14:20 2024
From: Yu Kuai
Subject: [PATCH v2 10/11] md: export helper md_is_rdwr()
Date: Wed, 24 Jan 2024 17:14:20 +0800
Message-ID: <20240124091421.1261579-11-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>

There are no functional changes for now; this prepares for fixing a
deadlock for dm-raid456.

Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 12 ------------
 drivers/md/md.h | 12 ++++++++++++
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 90cf31b53804..c803a805ea0d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -99,18 +99,6 @@ static void mddev_detach(struct mddev *mddev);
 static void export_rdev(struct md_rdev *rdev, struct mddev *mddev);
 static void md_wakeup_thread_directly(struct md_thread __rcu *thread);
 
-enum md_ro_state {
-	MD_RDWR,
-	MD_RDONLY,
-	MD_AUTO_READ,
-	MD_MAX_STATE
-};
-
-static bool md_is_rdwr(struct mddev *mddev)
-{
-	return (mddev->ro == MD_RDWR);
-}
-
 /*
  * Default number of read corrections we'll attempt on an rdev
  * before ejecting it from the array. We divide the read error
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 29b476ff3b9f..98da86d38ba8 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -558,6 +558,18 @@ enum recovery_flags {
 	MD_RESYNCING_REMOTE,	/* remote node is running resync thread */
 };
 
+enum md_ro_state {
+	MD_RDWR,
+	MD_RDONLY,
+	MD_AUTO_READ,
+	MD_MAX_STATE
+};
+
+static bool md_is_rdwr(struct mddev *mddev)
+{
+	return (mddev->ro == MD_RDWR);
+}
+
 static inline int __must_check mddev_lock(struct mddev *mddev)
 {
 	return mutex_lock_interruptible(&mddev->reconfig_mutex);

From patchwork Wed Jan 24 09:14:21 2024
From: Yu Kuai
Subject: [PATCH v2 11/11] md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape
Date: Wed, 24 Jan 2024 17:14:21 +0800
Message-ID: <20240124091421.1261579-12-yukuai3@huawei.com>
In-Reply-To: <20240124091421.1261579-1-yukuai3@huawei.com>

For raid456, if reshape is still in progress, then IO across the reshape
position will wait for reshape to make progress. However, for dm-raid,
reshape will never make progress in the following cases, hence such IO will
hang:

1) the array is read-only;
2) MD_RECOVERY_WAIT is set;
3) MD_RECOVERY_FROZEN is set;

After commit c467e97f079f ("md/raid6: use valid sector values to determine
if an I/O should wait on the reshape") fixed the problem that IO across the
reshape position doesn't wait for reshape, the dm-raid test
shell/lvconvert-raid-reshape.sh started to hang:

[root@fedora ~]# cat /proc/979/stack
[<0>] wait_woken+0x7d/0x90
[<0>] raid5_make_request+0x929/0x1d70 [raid456]
[<0>] md_handle_request+0xc2/0x3b0 [md_mod]
[<0>] raid_map+0x2c/0x50 [dm_raid]
[<0>] __map_bio+0x251/0x380 [dm_mod]
[<0>] dm_submit_bio+0x1f0/0x760 [dm_mod]
[<0>] __submit_bio+0xc2/0x1c0
[<0>] submit_bio_noacct_nocheck+0x17f/0x450
[<0>] submit_bio_noacct+0x2bc/0x780
[<0>] submit_bio+0x70/0xc0
[<0>] mpage_readahead+0x169/0x1f0
[<0>] blkdev_readahead+0x18/0x30
[<0>] read_pages+0x7c/0x3b0
[<0>] page_cache_ra_unbounded+0x1ab/0x280
[<0>] force_page_cache_ra+0x9e/0x130
[<0>] page_cache_sync_ra+0x3b/0x110
[<0>] filemap_get_pages+0x143/0xa30
[<0>] filemap_read+0xdc/0x4b0
[<0>] blkdev_read_iter+0x75/0x200
[<0>] vfs_read+0x272/0x460
[<0>] ksys_read+0x7a/0x170
[<0>] __x64_sys_read+0x1c/0x30
[<0>] do_syscall_64+0xc6/0x230
[<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74

This is because reshape can't make progress. For md/raid the problem
doesn't exist, because registering a new sync_thread doesn't rely on the IO
being done any more:

1) If the array is read-only, it can be switched to read-write via
   ioctl/sysfs;
2) md/raid never sets MD_RECOVERY_WAIT;
3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
   'reconfig_mutex', hence the flag can be cleared and reshape continued
   through the sysfs API 'sync_action'.

However, it's not clear yet how to avoid the problem in dm-raid. This patch
detects the above 3 cases in dm_suspend(), and fails such IO directly. If
users really hit this IO error, it means they were reading the wrong data
before c467e97f079f, and it's safe to read/write the array again after
reshape makes progress successfully.
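To make the interplay concrete, here is a rough editorial sketch of the
hang for case 3 (MD_RECOVERY_FROZEN set by raid_presuspend() from the
earlier patches in this series); it is a summary, not part of the original
message:

    dm suspend                             IO across reshape position
    ----------                             --------------------------
    raid_presuspend()
      md_frozen_sync_thread()
        set MD_RECOVERY_FROZEN             raid5_make_request()
        stop sync_thread                     IO is not ahead of reshape
    dm core waits for in-flight IO  <-->     wait for reshape to make progress
                                             (never happens: sync_thread frozen)

With this patch, raid5_prepare_suspend() wakes the waiter and the IO is
failed with BLK_STS_IOERR instead of hanging.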
Signed-off-by: Yu Kuai
---
 drivers/md/md.h    |  2 +-
 drivers/md/raid5.c | 32 +++++++++++++++++++++++++++++++-
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 98da86d38ba8..8e81f9e2fb20 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -565,7 +565,7 @@ enum md_ro_state {
 	MD_MAX_STATE
 };
 
-static bool md_is_rdwr(struct mddev *mddev)
+static inline bool md_is_rdwr(struct mddev *mddev)
 {
 	return (mddev->ro == MD_RDWR);
 }
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 6a7a32f7fb91..812d7ec64da5 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -5915,6 +5915,13 @@ static int add_all_stripe_bios(struct r5conf *conf,
 	return ret;
 }
 
+static bool reshape_disabled(struct mddev *mddev)
+{
+	return !md_is_rdwr(mddev) ||
+	       test_bit(MD_RECOVERY_WAIT, &mddev->recovery) ||
+	       test_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+}
+
 static enum stripe_result make_stripe_request(struct mddev *mddev,
 		struct r5conf *conf, struct stripe_request_ctx *ctx,
 		sector_t logical_sector, struct bio *bi)
@@ -5946,7 +5953,8 @@ static enum stripe_result make_stripe_request(struct mddev *mddev,
 			if (ahead_of_reshape(mddev, logical_sector,
 					     conf->reshape_safe)) {
 				spin_unlock_irq(&conf->device_lock);
-				return STRIPE_SCHEDULE_AND_RETRY;
+				ret = STRIPE_SCHEDULE_AND_RETRY;
+				goto out;
 			}
 		}
 		spin_unlock_irq(&conf->device_lock);
@@ -6025,6 +6033,13 @@ static enum stripe_result make_stripe_request(struct mddev *mddev,
 
 out_release:
 	raid5_release_stripe(sh);
+out:
+	if (ret == STRIPE_SCHEDULE_AND_RETRY && !mddev->gendisk &&
+	    reshape_disabled(mddev)) {
+		bi->bi_status = BLK_STS_IOERR;
+		ret = STRIPE_FAIL;
+		pr_err("dm-raid456: io failed across reshape position while reshape can't make progress");
+	}
 	return ret;
 }
 
@@ -8909,6 +8924,18 @@ static int raid5_start(struct mddev *mddev)
 	return r5l_start(conf->log);
 }
 
+/*
+ * This is only used for dm-raid456, where the caller has already frozen the
+ * sync_thread; hence if reshape is still in progress, IO that is waiting for
+ * reshape can never be done now, so wake up and handle those IO.
+ */
+static void raid5_prepare_suspend(struct mddev *mddev)
+{
+	struct r5conf *conf = mddev->private;
+
+	wake_up(&conf->wait_for_overlap);
+}
+
 static struct md_personality raid6_personality =
 {
 	.name		= "raid6",
@@ -8932,6 +8959,7 @@ static struct md_personality raid6_personality =
 	.quiesce	= raid5_quiesce,
 	.takeover	= raid6_takeover,
 	.change_consistency_policy = raid5_change_consistency_policy,
+	.prepare_suspend = raid5_prepare_suspend,
 };
 
 static struct md_personality raid5_personality =
@@ -8956,6 +8984,7 @@ static struct md_personality raid5_personality =
 	.quiesce	= raid5_quiesce,
 	.takeover	= raid5_takeover,
 	.change_consistency_policy = raid5_change_consistency_policy,
+	.prepare_suspend = raid5_prepare_suspend,
 };
 
 static struct md_personality raid4_personality =
@@ -8981,6 +9010,7 @@ static struct md_personality raid4_personality =
 	.quiesce	= raid5_quiesce,
 	.takeover	= raid4_takeover,
 	.change_consistency_policy = raid5_change_consistency_policy,
+	.prepare_suspend = raid5_prepare_suspend,
 };
 
 static int __init raid5_init(void)