From patchwork Mon Apr 10 11:35:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13206306 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A370BC77B70 for ; Mon, 10 Apr 2023 11:37:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229961AbjDJLhF (ORCPT ); Mon, 10 Apr 2023 07:37:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43914 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229919AbjDJLg7 (ORCPT ); Mon, 10 Apr 2023 07:36:59 -0400 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7503349D6; Mon, 10 Apr 2023 04:36:57 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Pw6Pw45L9z4f3mVh; Mon, 10 Apr 2023 19:36:52 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHvbDT9DNk0tNvHA--.50875S5; Mon, 10 Apr 2023 19:36:54 +0800 (CST) From: Yu Kuai To: logang@deltatee.com, song@kernel.org Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v5 1/6] md: pass a md_thread pointer to md_register_thread() Date: Mon, 10 Apr 2023 19:35:54 +0800 Message-Id: <20230410113559.1610455-2-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230410113559.1610455-1-yukuai1@huaweicloud.com> References: <20230410113559.1610455-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAHvbDT9DNk0tNvHA--.50875S5 X-Coremail-Antispam: 1UD129KBjvJXoW3JFy3uFykCryfGF15KF1ftFb_yoWfAry8pa yxWFyavr48ArW3ZrWDAa4Dua45uw10gFWjyry3C34fA3ZxK3y3JFyY9FyUJryDAa4rAF43 tw15KFW8uF4kKr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9m14x267AKxVW5JVWrJwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r1I6r4UM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrwCFx2 IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v2 6r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67 AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IY s7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr 0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUqAp5UUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai Prepare to protect md_thread with rcu, there are no functional changes. Signed-off-by: Yu Kuai --- drivers/md/md-cluster.c | 11 +++++------ drivers/md/md-multipath.c | 6 +++--- drivers/md/md.c | 27 ++++++++++++++------------- drivers/md/md.h | 7 +++---- drivers/md/raid1.c | 5 ++--- drivers/md/raid10.c | 15 ++++++--------- drivers/md/raid5-cache.c | 5 ++--- drivers/md/raid5.c | 15 ++++++--------- 8 files changed, 41 insertions(+), 50 deletions(-) diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c index 10e0c5381d01..c19e29cb73bf 100644 --- a/drivers/md/md-cluster.c +++ b/drivers/md/md-cluster.c @@ -362,9 +362,8 @@ static void __recover_slot(struct mddev *mddev, int slot) set_bit(slot, &cinfo->recovery_map); if (!cinfo->recovery_thread) { - cinfo->recovery_thread = md_register_thread(recover_bitmaps, - mddev, "recover"); - if (!cinfo->recovery_thread) { + if (md_register_thread(&cinfo->recovery_thread, recover_bitmaps, + mddev, "recover")) { pr_warn("md-cluster: Could not create recovery thread\n"); return; } @@ -888,9 +887,9 @@ static int join(struct mddev *mddev, int nodes) goto err; } /* Initiate the communication resources */ - ret = -ENOMEM; - cinfo->recv_thread = md_register_thread(recv_daemon, mddev, "cluster_recv"); - if (!cinfo->recv_thread) { + ret = md_register_thread(&cinfo->recv_thread, recv_daemon, mddev, + "cluster_recv"); + if (ret) { pr_err("md-cluster: cannot allocate memory for recv_thread!\n"); goto err; } diff --git a/drivers/md/md-multipath.c b/drivers/md/md-multipath.c index 66edf5e72bd6..ceec9e4b2a60 100644 --- a/drivers/md/md-multipath.c +++ b/drivers/md/md-multipath.c @@ -400,9 +400,9 @@ static int multipath_run (struct mddev *mddev) if (ret) goto out_free_conf; - mddev->thread = md_register_thread(multipathd, mddev, - "multipath"); - if (!mddev->thread) + ret = md_register_thread(&mddev->thread, multipathd, mddev, + "multipath"); + if (ret) goto out_free_conf; pr_info("multipath: array %s active with %d out of %d IO paths\n", diff --git a/drivers/md/md.c b/drivers/md/md.c index 9bc05f451d42..1459c2cfb0dd 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -7896,29 +7896,32 @@ void md_wakeup_thread(struct md_thread *thread) } EXPORT_SYMBOL(md_wakeup_thread); -struct md_thread *md_register_thread(void (*run) (struct md_thread *), - struct mddev *mddev, const char *name) +int md_register_thread(struct md_thread **threadp, + void (*run)(struct md_thread *), + struct mddev *mddev, const char *name) { struct md_thread *thread; thread = kzalloc(sizeof(struct md_thread), GFP_KERNEL); if (!thread) - return NULL; + return -ENOMEM; init_waitqueue_head(&thread->wqueue); thread->run = run; thread->mddev = mddev; thread->timeout = MAX_SCHEDULE_TIMEOUT; - thread->tsk = kthread_run(md_thread, thread, - "%s_%s", - mdname(thread->mddev), - name); + thread->tsk = kthread_run(md_thread, thread, "%s_%s", + mdname(thread->mddev), name); if (IS_ERR(thread->tsk)) { + int err = PTR_ERR(thread->tsk); + kfree(thread); - return NULL; + return err; } - return thread; + + *threadp = thread; + return 0; } EXPORT_SYMBOL(md_register_thread); @@ -9199,10 +9202,8 @@ static void md_start_sync(struct work_struct *ws) { struct mddev *mddev = container_of(ws, struct mddev, del_work); - mddev->sync_thread = md_register_thread(md_do_sync, - mddev, - "resync"); - if (!mddev->sync_thread) { + if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev, + "resync")) { pr_warn("%s: could not start resync thread...\n", mdname(mddev)); /* leave the spares where they are, it shouldn't hurt */ diff --git a/drivers/md/md.h b/drivers/md/md.h index e148e3c83b0d..7af45b8432e9 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -730,10 +730,9 @@ extern int register_md_cluster_operations(struct md_cluster_operations *ops, extern int unregister_md_cluster_operations(void); extern int md_setup_cluster(struct mddev *mddev, int nodes); extern void md_cluster_stop(struct mddev *mddev); -extern struct md_thread *md_register_thread( - void (*run)(struct md_thread *thread), - struct mddev *mddev, - const char *name); +extern int md_register_thread(struct md_thread **threadp, + void (*run)(struct md_thread *thread), + struct mddev *mddev, const char *name); extern void md_unregister_thread(struct md_thread **threadp); extern void md_wakeup_thread(struct md_thread *thread); extern void md_check_recovery(struct mddev *mddev); diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 68a9e2d9985b..1217c1db0a40 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -3083,9 +3083,8 @@ static struct r1conf *setup_conf(struct mddev *mddev) } } - err = -ENOMEM; - conf->thread = md_register_thread(raid1d, mddev, "raid1"); - if (!conf->thread) + err = md_register_thread(&conf->thread, raid1d, mddev, "raid1"); + if (err) goto abort; return conf; diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 6c66357f92f5..0171ba4f19b0 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -4077,9 +4077,8 @@ static struct r10conf *setup_conf(struct mddev *mddev) init_waitqueue_head(&conf->wait_barrier); atomic_set(&conf->nr_pending, 0); - err = -ENOMEM; - conf->thread = md_register_thread(raid10d, mddev, "raid10"); - if (!conf->thread) + err = md_register_thread(&conf->thread, raid10d, mddev, "raid10"); + if (err) goto out; conf->mddev = mddev; @@ -4273,9 +4272,8 @@ static int raid10_run(struct mddev *mddev) clear_bit(MD_RECOVERY_CHECK, &mddev->recovery); set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); - mddev->sync_thread = md_register_thread(md_do_sync, mddev, - "reshape"); - if (!mddev->sync_thread) + if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev, + "reshape")) goto out_free_conf; } @@ -4686,9 +4684,8 @@ static int raid10_start_reshape(struct mddev *mddev) set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); - mddev->sync_thread = md_register_thread(md_do_sync, mddev, - "reshape"); - if (!mddev->sync_thread) { + if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev, + "reshape")) { ret = -EAGAIN; goto abort; } diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c index 46182b955aef..0464d4d551fc 100644 --- a/drivers/md/raid5-cache.c +++ b/drivers/md/raid5-cache.c @@ -3121,9 +3121,8 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev) spin_lock_init(&log->tree_lock); INIT_RADIX_TREE(&log->big_stripe_tree, GFP_NOWAIT | __GFP_NOWARN); - log->reclaim_thread = md_register_thread(r5l_reclaim_thread, - log->rdev->mddev, "reclaim"); - if (!log->reclaim_thread) + if (md_register_thread(&log->reclaim_thread, r5l_reclaim_thread, + log->rdev->mddev, "reclaim")) goto reclaim_thread; log->reclaim_thread->timeout = R5C_RECLAIM_WAKEUP_INTERVAL; diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 7b820b81d8c2..04b1093195d0 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -7665,11 +7665,10 @@ static struct r5conf *setup_conf(struct mddev *mddev) } sprintf(pers_name, "raid%d", mddev->new_level); - conf->thread = md_register_thread(raid5d, mddev, pers_name); - if (!conf->thread) { + ret = md_register_thread(&conf->thread, raid5d, mddev, pers_name); + if (ret) { pr_warn("md/raid:%s: couldn't allocate thread.\n", mdname(mddev)); - ret = -ENOMEM; goto abort; } @@ -7989,9 +7988,8 @@ static int raid5_run(struct mddev *mddev) clear_bit(MD_RECOVERY_CHECK, &mddev->recovery); set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); - mddev->sync_thread = md_register_thread(md_do_sync, mddev, - "reshape"); - if (!mddev->sync_thread) + if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev, + "reshape")) goto abort; } @@ -8567,9 +8565,8 @@ static int raid5_start_reshape(struct mddev *mddev) clear_bit(MD_RECOVERY_DONE, &mddev->recovery); set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); set_bit(MD_RECOVERY_RUNNING, &mddev->recovery); - mddev->sync_thread = md_register_thread(md_do_sync, mddev, - "reshape"); - if (!mddev->sync_thread) { + if (md_register_thread(&mddev->sync_thread, md_do_sync, mddev, + "reshape")) { mddev->recovery = 0; spin_lock_irq(&conf->device_lock); write_seqcount_begin(&conf->gen_lock); From patchwork Mon Apr 10 11:35:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13206301 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 475DEC77B61 for ; Mon, 10 Apr 2023 11:37:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229935AbjDJLhA (ORCPT ); Mon, 10 Apr 2023 07:37:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43880 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229739AbjDJLg6 (ORCPT ); Mon, 10 Apr 2023 07:36:58 -0400 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 74C7349C8; Mon, 10 Apr 2023 04:36:57 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Pw6Pw6nzTz4f3mW6; Mon, 10 Apr 2023 19:36:52 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHvbDT9DNk0tNvHA--.50875S6; Mon, 10 Apr 2023 19:36:54 +0800 (CST) From: Yu Kuai To: logang@deltatee.com, song@kernel.org Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v5 2/6] md: factor out a helper to wake up md_thread directly Date: Mon, 10 Apr 2023 19:35:55 +0800 Message-Id: <20230410113559.1610455-3-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230410113559.1610455-1-yukuai1@huaweicloud.com> References: <20230410113559.1610455-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAHvbDT9DNk0tNvHA--.50875S6 X-Coremail-Antispam: 1UD129KBjvJXoWxXFW8CF17JrW5Xr1kKF4xtFb_yoW5CFW3p3 y8tF15Wr48ZFZ8ZFZrJa4DGa4rZr10qFy7try7Cw4rJw1rKw43tFyS9Fyjya4DAFyrAw45 Zw15tFWrurZ2kF7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9m14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrwCFx2 IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v2 6r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67 AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IY s7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr 0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUc6pPUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai md_wakeup_thread() can't wakeup md_thread->tsk if md_thread->run is still in progress, and in some cases md_thread->tsk need to be woke up directly, like md_set_readonly() and do_md_stop(). Commit 9dfbdafda3b3 ("md: unlock mddev before reap sync_thread in action_store") introduce a new scenario where unregister sync_thread is not protected by 'reconfig_mutex', this can cause null-ptr-deference in theroy: t1: md_set_readonly t2: action_store md_unregister_thread // 'reconfig_mutex' is not held // 'reconfig_mutex' is held by caller if (mddev->sync_thread) thread = *threadp *threadp = NULL wake_up_process(mddev->sync_thread->tsk) // null-ptr-deference This patch factor out a helper to wake up md_thread directly, so that 'sync_thread' won't be accessed multiple times from the reader side. And perhaps this helper will be used later to fix action_store(). This patch also prepare to protect md_thread with rcu. Signed-off-by: Yu Kuai --- drivers/md/md.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 1459c2cfb0dd..139c7b0202e3 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -92,6 +92,7 @@ static struct workqueue_struct *md_rdev_misc_wq; static int remove_and_add_spares(struct mddev *mddev, struct md_rdev *this); static void mddev_detach(struct mddev *mddev); +static void md_wakeup_thread_directly(struct md_thread *thread); enum md_ro_state { MD_RDWR, @@ -6269,10 +6270,12 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev) } if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) set_bit(MD_RECOVERY_INTR, &mddev->recovery); - if (mddev->sync_thread) - /* Thread might be blocked waiting for metadata update - * which will now never happen */ - wake_up_process(mddev->sync_thread->tsk); + + /* + * Thread might be blocked waiting for metadata update which will now + * never happen + */ + md_wakeup_thread_directly(mddev->sync_thread); if (mddev->external && test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags)) return -EBUSY; @@ -6333,10 +6336,12 @@ static int do_md_stop(struct mddev *mddev, int mode, } if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) set_bit(MD_RECOVERY_INTR, &mddev->recovery); - if (mddev->sync_thread) - /* Thread might be blocked waiting for metadata update - * which will now never happen */ - wake_up_process(mddev->sync_thread->tsk); + + /* + * Thread might be blocked waiting for metadata update which will now + * never happen + */ + md_wakeup_thread_directly(mddev->sync_thread); mddev_unlock(mddev); wait_event(resync_wait, (mddev->sync_thread == NULL && @@ -7886,6 +7891,12 @@ static int md_thread(void *arg) return 0; } +static void md_wakeup_thread_directly(struct md_thread *thread) +{ + if (thread) + wake_up_process(thread->tsk); +} + void md_wakeup_thread(struct md_thread *thread) { if (thread) { From patchwork Mon Apr 10 11:35:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13206304 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0C12C77B61 for ; Mon, 10 Apr 2023 11:37:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229955AbjDJLhE (ORCPT ); Mon, 10 Apr 2023 07:37:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43908 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229771AbjDJLg7 (ORCPT ); Mon, 10 Apr 2023 07:36:59 -0400 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E74A549ED; Mon, 10 Apr 2023 04:36:57 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Pw6Px0NL9z4f3qsl; Mon, 10 Apr 2023 19:36:53 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHvbDT9DNk0tNvHA--.50875S7; Mon, 10 Apr 2023 19:36:54 +0800 (CST) From: Yu Kuai To: logang@deltatee.com, song@kernel.org Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v5 3/6] dm-raid: remove useless checking in raid_message() Date: Mon, 10 Apr 2023 19:35:56 +0800 Message-Id: <20230410113559.1610455-4-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230410113559.1610455-1-yukuai1@huaweicloud.com> References: <20230410113559.1610455-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAHvbDT9DNk0tNvHA--.50875S7 X-Coremail-Antispam: 1UD129KBjvdXoWrZrW7trWxXry3Gr1kGF4rXwb_yoWfuFcEgF s5Xr9rXr17u34fA3W2vw40vr90ywn5ur1kWF1rtFyayF18KryrXryru3s8CwsrZFW7CryU CrWUKr1fCrn5CjkaLaAFLSUrUUUUUb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT 9fnUUIcSsGvfJTRUUUb-kFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k26cxKx2IYs7xG 6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUWwA2048vs2IY02 0Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28EF7xv wVC0I7IYx2IY67AKxVW5JVW7JwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr1UM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7MxAIw28IcxkI7VAKI48JMxC20s 026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_ JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14 v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8JVWxJwCI42IY6xAIw20EY4v20xva j40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8JV W8JrUvcSsGvfC2KfnxnUUI43ZEXa7VUbJ73DUUUUU== X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai md_wakeup_thread() handle the case that pass in md_thread is NULL, there is no need to check this. Prepare to protect md_thread with rcu. Signed-off-by: Yu Kuai --- drivers/md/dm-raid.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index 2dfd51509647..9db45f3508e3 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3750,11 +3750,11 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv, * canceling read-auto mode */ mddev->ro = 0; - if (!mddev->suspended && mddev->sync_thread) + if (!mddev->suspended) md_wakeup_thread(mddev->sync_thread); } set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); - if (!mddev->suspended && mddev->thread) + if (!mddev->suspended) md_wakeup_thread(mddev->thread); return 0; From patchwork Mon Apr 10 11:35:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13206303 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EECDAC77B61 for ; Mon, 10 Apr 2023 11:37:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229946AbjDJLhD (ORCPT ); Mon, 10 Apr 2023 07:37:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43904 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229744AbjDJLg7 (ORCPT ); Mon, 10 Apr 2023 07:36:59 -0400 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1E3B49D8; Mon, 10 Apr 2023 04:36:57 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Pw6Px4yQxz4f3r6m; Mon, 10 Apr 2023 19:36:53 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHvbDT9DNk0tNvHA--.50875S8; Mon, 10 Apr 2023 19:36:55 +0800 (CST) From: Yu Kuai To: logang@deltatee.com, song@kernel.org Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v5 4/6] md/bitmap: always wake up md_thread in timeout_store Date: Mon, 10 Apr 2023 19:35:57 +0800 Message-Id: <20230410113559.1610455-5-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230410113559.1610455-1-yukuai1@huaweicloud.com> References: <20230410113559.1610455-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAHvbDT9DNk0tNvHA--.50875S8 X-Coremail-Antispam: 1UD129KBjvdXoWrZrW3Wr13ArW7tF4fGr4rZrb_yoWkurcEga nY93s3JrW7CFy7Kw4ayw40vryUtF45u3WkXFn2grWIvwnrK395Wry0vw1qv3WxCFW3Cryr Cry8Wr48Zw15ujkaLaAFLSUrUUUUUb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT 9fnUUIcSsGvfJTRUUUbTxFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k26cxKx2IYs7xG 6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAVCq3wA2048vs2 IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28E F7xvwVC0I7IYx2IY67AKxVW5JVW7JwA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr 1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0D M2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjx v20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1l F7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7MxAIw28IcxkI7VAKI48JMx C20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAF wI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20x vE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8JVWxJwCI42IY6xAIw20EY4v2 0xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxV W8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7VUbmZX7UUUUU== X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai md_wakeup_thread() can handle the case that pass in md_thread is NULL, the only difference is that md_wakeup_thread() will be called when current timeout is 'MAX_SCHEDULE_TIMEOUT', this should not matter because timeout_store() is not hot path, and the daemon process is woke up more than demand from other context already. This patch prepare to factor out a helper to set timeout. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c index e7cc6ba1b657..014e5c8a4fe0 100644 --- a/drivers/md/md-bitmap.c +++ b/drivers/md/md-bitmap.c @@ -2452,11 +2452,11 @@ timeout_store(struct mddev *mddev, const char *buf, size_t len) * the bitmap is all clean and we don't need to * adjust the timeout right now */ - if (mddev->thread->timeout < MAX_SCHEDULE_TIMEOUT) { + if (mddev->thread->timeout < MAX_SCHEDULE_TIMEOUT) mddev->thread->timeout = timeout; - md_wakeup_thread(mddev->thread); - } } + + md_wakeup_thread(mddev->thread); return len; } From patchwork Mon Apr 10 11:35:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13206305 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 87E70C77B73 for ; Mon, 10 Apr 2023 11:37:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229966AbjDJLhG (ORCPT ); Mon, 10 Apr 2023 07:37:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229920AbjDJLg7 (ORCPT ); Mon, 10 Apr 2023 07:36:59 -0400 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 29FD54C22; Mon, 10 Apr 2023 04:36:58 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Pw6Px5lBZz4f3yqs; Mon, 10 Apr 2023 19:36:53 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHvbDT9DNk0tNvHA--.50875S9; Mon, 10 Apr 2023 19:36:55 +0800 (CST) From: Yu Kuai To: logang@deltatee.com, song@kernel.org Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v5 5/6] md/bitmap: factor out a helper to set timeout Date: Mon, 10 Apr 2023 19:35:58 +0800 Message-Id: <20230410113559.1610455-6-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230410113559.1610455-1-yukuai1@huaweicloud.com> References: <20230410113559.1610455-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAHvbDT9DNk0tNvHA--.50875S9 X-Coremail-Antispam: 1UD129KBjvJXoWxJF1xXrWfJryrKF4UKw18Grg_yoWrXFyUp3 yfKas0yF18XrWfXw4xJaykCF1rXr1vqFZrtryxX34rCwn8Gws3tFyrWa4Dt3WDC34rAFs0 q3W5GrW8CFyUWr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUU9C14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCF04k20xvY0x0EwIxGrw CFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE 14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2 IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxK x2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI 0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUQSdkUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai Register/unregister 'mddev->thread' are both under 'reconfig_mutex', however, some context didn't hold the mutex to access mddev->thread, which can cause null-ptr-deference: 1) md_bitmap_daemon_work() can be called from md_check_recovery() where 'reconfig_mutex' is not held, deference 'mddev->thread' might cause null-ptr-deference, because md_unregister_thread() reset the pointer before stopping the thread. 2) timeout_store() access 'mddev->thread' multiple times, null-ptr-deference can be triggered if 'mddev->thread' is reset in the middle. This patch factor out a helper to set timeout, the new helper always check if 'mddev->thread' is null first, so that problem 1 can be fixed. Now that this helper only access 'mddev->thread' once, but it's possible that 'mddev->thread' is freed while this helper is still in progress, hence the problem is not fixed yet. Follow up patches will fix this by protecting md_thread with rcu. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 35 +++++++++++++++++++---------------- 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c index 014e5c8a4fe0..29fd41ef55a6 100644 --- a/drivers/md/md-bitmap.c +++ b/drivers/md/md-bitmap.c @@ -1218,11 +1218,22 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap, sector_t offset, sector_t *blocks, int create); +static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout, + bool force) +{ + struct md_thread *thread = mddev->thread; + + if (!thread) + return; + + if (force || thread->timeout < MAX_SCHEDULE_TIMEOUT) + thread->timeout = timeout; +} + /* * bitmap daemon -- periodically wakes up to clean bits and flush pages * out to disk */ - void md_bitmap_daemon_work(struct mddev *mddev) { struct bitmap *bitmap; @@ -1246,7 +1257,7 @@ void md_bitmap_daemon_work(struct mddev *mddev) bitmap->daemon_lastrun = jiffies; if (bitmap->allclean) { - mddev->thread->timeout = MAX_SCHEDULE_TIMEOUT; + mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true); goto done; } bitmap->allclean = 1; @@ -1343,8 +1354,7 @@ void md_bitmap_daemon_work(struct mddev *mddev) done: if (bitmap->allclean == 0) - mddev->thread->timeout = - mddev->bitmap_info.daemon_sleep; + mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true); mutex_unlock(&mddev->bitmap_info.mutex); } @@ -1797,8 +1807,7 @@ void md_bitmap_destroy(struct mddev *mddev) mddev->bitmap = NULL; /* disconnect from the md device */ spin_unlock(&mddev->lock); mutex_unlock(&mddev->bitmap_info.mutex); - if (mddev->thread) - mddev->thread->timeout = MAX_SCHEDULE_TIMEOUT; + mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true); md_bitmap_free(bitmap); } @@ -1941,7 +1950,7 @@ int md_bitmap_load(struct mddev *mddev) /* Kick recovery in case any bits were set */ set_bit(MD_RECOVERY_NEEDED, &bitmap->mddev->recovery); - mddev->thread->timeout = mddev->bitmap_info.daemon_sleep; + mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true); md_wakeup_thread(mddev->thread); md_bitmap_update_sb(bitmap); @@ -2446,17 +2455,11 @@ timeout_store(struct mddev *mddev, const char *buf, size_t len) timeout = MAX_SCHEDULE_TIMEOUT-1; if (timeout < 1) timeout = 1; - mddev->bitmap_info.daemon_sleep = timeout; - if (mddev->thread) { - /* if thread->timeout is MAX_SCHEDULE_TIMEOUT, then - * the bitmap is all clean and we don't need to - * adjust the timeout right now - */ - if (mddev->thread->timeout < MAX_SCHEDULE_TIMEOUT) - mddev->thread->timeout = timeout; - } + mddev->bitmap_info.daemon_sleep = timeout; + mddev_set_timeout(mddev, timeout, false); md_wakeup_thread(mddev->thread); + return len; } From patchwork Mon Apr 10 11:35:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13206307 X-Patchwork-Delegate: song@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7847FC77B61 for ; Mon, 10 Apr 2023 11:37:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229972AbjDJLhH (ORCPT ); Mon, 10 Apr 2023 07:37:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43926 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229927AbjDJLhA (ORCPT ); Mon, 10 Apr 2023 07:37:00 -0400 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B3144C26; Mon, 10 Apr 2023 04:36:58 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Pw6Py3HJtz4f3lXZ; Mon, 10 Apr 2023 19:36:54 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHvbDT9DNk0tNvHA--.50875S10; Mon, 10 Apr 2023 19:36:55 +0800 (CST) From: Yu Kuai To: logang@deltatee.com, song@kernel.org Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v5 6/6] md: protect md_thread with rcu Date: Mon, 10 Apr 2023 19:35:59 +0800 Message-Id: <20230410113559.1610455-7-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230410113559.1610455-1-yukuai1@huaweicloud.com> References: <20230410113559.1610455-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAHvbDT9DNk0tNvHA--.50875S10 X-Coremail-Antispam: 1UD129KBjvAXoWfJFWkWrWkWr13KF1xKF1xZrb_yoW8Xr1fXo Z3Cw13Zr18GF1rXFyUJrn3tFsxX34DG3yfta15uFs8WFnFv395Zr13XF43JF1jqFnxWr47 Zr9rXw4IgFW8tw48n29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7v73VFW2AGmfu7bjvjm3 AaLaJ3UjIYCTnIWjp_UUUYL7AC8VAFwI0_Wr0E3s1l1xkIjI8I6I8E6xAIw20EY4v20xva j40_Wr0E3s1l1IIY67AEw4v_Jr0_Jr4l82xGYIkIc2x26280x7IE14v26r126s0DM28Irc Ia0xkI8VCY1x0267AKxVW5JVCq3wA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK021l 84ACjcxK6xIIjxv20xvE14v26ryj6F1UM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4UJV WxJr1l84ACjcxK6I8E87Iv67AKxVW0oVCq3wA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_GcCE 3s1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E2I x0cI8IcVAFwI0_Jr0_Jr4lYx0Ex4A2jsIE14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJVW8 JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1l42xK82IYc2Ij64vIr4 1l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK 67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r1q6r43MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI 8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IYx2IY6xkF7I0E14v26F4j6r4UJwCI42IY6xAIw20E Y4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267 AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7VUbmZX7UUUUU== X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-raid@vger.kernel.org From: Yu Kuai Our test reports a uaf for 'mddev->sync_thread': T1 T2 md_start_sync md_register_thread // mddev->sync_thread is set raid1d md_check_recovery md_reap_sync_thread md_unregister_thread kfree md_wakeup_thread wake_up ->sync_thread was freed Root cause is that there is a small windown between register thread and wake up thread, where the thread can be freed concurrently. Currently, a global spinlock 'pers_lock' is borrowed to protect 'mddev->thread', this problem can be fixed likewise, however, there might be similar problem elsewhere, and use a global lock for all the cases is not good. This patch protect md_thread with rcu. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 29 ++++++++++++----- drivers/md/md.c | 68 +++++++++++++++++++--------------------- drivers/md/md.h | 10 +++--- drivers/md/raid1.c | 4 +-- drivers/md/raid1.h | 2 +- drivers/md/raid10.c | 10 ++++-- drivers/md/raid10.h | 2 +- drivers/md/raid5-cache.c | 15 +++++---- drivers/md/raid5.c | 4 +-- drivers/md/raid5.h | 2 +- 10 files changed, 81 insertions(+), 65 deletions(-) diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c index 29fd41ef55a6..b9baeea5605e 100644 --- a/drivers/md/md-bitmap.c +++ b/drivers/md/md-bitmap.c @@ -1219,15 +1219,27 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap, int create); static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout, - bool force) + bool force, bool protected) { - struct md_thread *thread = mddev->thread; + struct md_thread *thread; + + if (!protected) { + rcu_read_lock(); + thread = rcu_dereference(mddev->thread); + } else { + thread = rcu_dereference_protected(mddev->thread, + lockdep_is_held(&mddev->reconfig_mutex)); + } if (!thread) - return; + goto out; if (force || thread->timeout < MAX_SCHEDULE_TIMEOUT) thread->timeout = timeout; + +out: + if (!protected) + rcu_read_unlock(); } /* @@ -1257,7 +1269,7 @@ void md_bitmap_daemon_work(struct mddev *mddev) bitmap->daemon_lastrun = jiffies; if (bitmap->allclean) { - mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true); + mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true, false); goto done; } bitmap->allclean = 1; @@ -1354,7 +1366,8 @@ void md_bitmap_daemon_work(struct mddev *mddev) done: if (bitmap->allclean == 0) - mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true); + mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true, + false); mutex_unlock(&mddev->bitmap_info.mutex); } @@ -1807,7 +1820,7 @@ void md_bitmap_destroy(struct mddev *mddev) mddev->bitmap = NULL; /* disconnect from the md device */ spin_unlock(&mddev->lock); mutex_unlock(&mddev->bitmap_info.mutex); - mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true); + mddev_set_timeout(mddev, MAX_SCHEDULE_TIMEOUT, true, true); md_bitmap_free(bitmap); } @@ -1950,7 +1963,7 @@ int md_bitmap_load(struct mddev *mddev) /* Kick recovery in case any bits were set */ set_bit(MD_RECOVERY_NEEDED, &bitmap->mddev->recovery); - mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true); + mddev_set_timeout(mddev, mddev->bitmap_info.daemon_sleep, true, true); md_wakeup_thread(mddev->thread); md_bitmap_update_sb(bitmap); @@ -2457,7 +2470,7 @@ timeout_store(struct mddev *mddev, const char *buf, size_t len) timeout = 1; mddev->bitmap_info.daemon_sleep = timeout; - mddev_set_timeout(mddev, timeout, false); + mddev_set_timeout(mddev, timeout, false, false); md_wakeup_thread(mddev->thread); return len; diff --git a/drivers/md/md.c b/drivers/md/md.c index 139c7b0202e3..3afece35f0ee 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -70,11 +70,7 @@ #include "md-bitmap.h" #include "md-cluster.h" -/* pers_list is a list of registered personalities protected - * by pers_lock. - * pers_lock does extra service to protect accesses to - * mddev->thread when the mutex cannot be held. - */ +/* pers_list is a list of registered personalities protected by pers_lock. */ static LIST_HEAD(pers_list); static DEFINE_SPINLOCK(pers_lock); @@ -92,7 +88,7 @@ static struct workqueue_struct *md_rdev_misc_wq; static int remove_and_add_spares(struct mddev *mddev, struct md_rdev *this); static void mddev_detach(struct mddev *mddev); -static void md_wakeup_thread_directly(struct md_thread *thread); +static void md_wakeup_thread_directly(struct md_thread __rcu *thread); enum md_ro_state { MD_RDWR, @@ -458,8 +454,10 @@ static void md_submit_bio(struct bio *bio) */ void mddev_suspend(struct mddev *mddev) { - WARN_ON_ONCE(mddev->thread && current == mddev->thread->tsk); - lockdep_assert_held(&mddev->reconfig_mutex); + struct md_thread *thread = rcu_dereference_protected(mddev->thread, + lockdep_is_held(&mddev->reconfig_mutex)); + + WARN_ON_ONCE(thread && current == thread->tsk); if (mddev->suspended++) return; wake_up(&mddev->sb_wait); @@ -801,13 +799,8 @@ void mddev_unlock(struct mddev *mddev) } else mutex_unlock(&mddev->reconfig_mutex); - /* As we've dropped the mutex we need a spinlock to - * make sure the thread doesn't disappear - */ - spin_lock(&pers_lock); md_wakeup_thread(mddev->thread); wake_up(&mddev->sb_wait); - spin_unlock(&pers_lock); } EXPORT_SYMBOL_GPL(mddev_unlock); @@ -7891,23 +7884,33 @@ static int md_thread(void *arg) return 0; } -static void md_wakeup_thread_directly(struct md_thread *thread) +static void md_wakeup_thread_directly(struct md_thread __rcu *thread) { - if (thread) - wake_up_process(thread->tsk); + struct md_thread *t; + + rcu_read_lock(); + t = rcu_dereference(thread); + if (t) + wake_up_process(t->tsk); + rcu_read_unlock(); } -void md_wakeup_thread(struct md_thread *thread) +void md_wakeup_thread(struct md_thread __rcu *thread) { - if (thread) { - pr_debug("md: waking up MD thread %s.\n", thread->tsk->comm); - set_bit(THREAD_WAKEUP, &thread->flags); - wake_up(&thread->wqueue); + struct md_thread *t; + + rcu_read_lock(); + t = rcu_dereference(thread); + if (t) { + pr_debug("md: waking up MD thread %s.\n", t->tsk->comm); + set_bit(THREAD_WAKEUP, &t->flags); + wake_up(&t->wqueue); } + rcu_read_unlock(); } EXPORT_SYMBOL(md_wakeup_thread); -int md_register_thread(struct md_thread **threadp, +int md_register_thread(struct md_thread __rcu **threadp, void (*run)(struct md_thread *), struct mddev *mddev, const char *name) { @@ -7931,27 +7934,20 @@ int md_register_thread(struct md_thread **threadp, return err; } - *threadp = thread; + rcu_assign_pointer(*threadp, thread); return 0; } EXPORT_SYMBOL(md_register_thread); -void md_unregister_thread(struct md_thread **threadp) +void md_unregister_thread(struct md_thread __rcu **threadp) { - struct md_thread *thread; + struct md_thread *thread = rcu_dereference_protected(*threadp, true); - /* - * Locking ensures that mddev_unlock does not wake_up a - * non-existent thread - */ - spin_lock(&pers_lock); - thread = *threadp; - if (!thread) { - spin_unlock(&pers_lock); + if (!thread) return; - } - *threadp = NULL; - spin_unlock(&pers_lock); + + rcu_assign_pointer(*threadp, NULL); + synchronize_rcu(); pr_debug("interrupting MD-thread pid %d\n", task_pid_nr(thread->tsk)); kthread_stop(thread->tsk); diff --git a/drivers/md/md.h b/drivers/md/md.h index 7af45b8432e9..621a3f183afd 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -367,8 +367,8 @@ struct mddev { int new_chunk_sectors; int reshape_backwards; - struct md_thread *thread; /* management thread */ - struct md_thread *sync_thread; /* doing resync or reconstruct */ + struct md_thread __rcu *thread; /* management thread */ + struct md_thread __rcu *sync_thread; /* doing resync or reconstruct */ /* 'last_sync_action' is initialized to "none". It is set when a * sync operation (i.e "data-check", "requested-resync", "resync", @@ -730,11 +730,11 @@ extern int register_md_cluster_operations(struct md_cluster_operations *ops, extern int unregister_md_cluster_operations(void); extern int md_setup_cluster(struct mddev *mddev, int nodes); extern void md_cluster_stop(struct mddev *mddev); -extern int md_register_thread(struct md_thread **threadp, +extern int md_register_thread(struct md_thread __rcu **threadp, void (*run)(struct md_thread *thread), struct mddev *mddev, const char *name); -extern void md_unregister_thread(struct md_thread **threadp); -extern void md_wakeup_thread(struct md_thread *thread); +extern void md_unregister_thread(struct md_thread __rcu **threadp); +extern void md_wakeup_thread(struct md_thread __rcu *thread); extern void md_check_recovery(struct mddev *mddev); extern void md_reap_sync_thread(struct mddev *mddev); extern int mddev_init_writes_pending(struct mddev *mddev); diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 1217c1db0a40..64f019e3615f 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -3176,8 +3176,8 @@ static int raid1_run(struct mddev *mddev) /* * Ok, everything is just fine now */ - mddev->thread = conf->thread; - conf->thread = NULL; + rcu_assign_pointer(mddev->thread, conf->thread); + rcu_assign_pointer(conf->thread, NULL); mddev->private = conf; set_bit(MD_FAILFAST_SUPPORTED, &mddev->flags); diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h index ebb6788820e7..468f189da7a0 100644 --- a/drivers/md/raid1.h +++ b/drivers/md/raid1.h @@ -130,7 +130,7 @@ struct r1conf { /* When taking over an array from a different personality, we store * the new thread here until we fully activate the array. */ - struct md_thread *thread; + struct md_thread __rcu *thread; /* Keep track of cluster resync window to send to other * nodes. diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 0171ba4f19b0..8fa4e61c3f79 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -980,6 +980,7 @@ static void lower_barrier(struct r10conf *conf) static bool stop_waiting_barrier(struct r10conf *conf) { struct bio_list *bio_list = current->bio_list; + struct md_thread *thread; /* barrier is dropped */ if (!conf->barrier) @@ -995,8 +996,11 @@ static bool stop_waiting_barrier(struct r10conf *conf) (!bio_list_empty(&bio_list[0]) || !bio_list_empty(&bio_list[1]))) return true; + /* daemon thread must exist while handling io */ + thread = rcu_dereference_protected(conf->mddev->thread, true); + /* move on if recovery thread is blocked by us */ - if (conf->mddev->thread->tsk == current && + if (thread->tsk == current && test_bit(MD_RECOVERY_RUNNING, &conf->mddev->recovery) && conf->nr_queued > 0) return true; @@ -4140,8 +4144,8 @@ static int raid10_run(struct mddev *mddev) } } - mddev->thread = conf->thread; - conf->thread = NULL; + rcu_assign_pointer(mddev->thread, conf->thread); + rcu_assign_pointer(conf->thread, NULL); if (mddev->queue) { blk_queue_max_write_zeroes_sectors(mddev->queue, 0); diff --git a/drivers/md/raid10.h b/drivers/md/raid10.h index 8c072ce0bc54..63e48b11b552 100644 --- a/drivers/md/raid10.h +++ b/drivers/md/raid10.h @@ -100,7 +100,7 @@ struct r10conf { /* When taking over an array from a different personality, we store * the new thread here until we fully activate the array. */ - struct md_thread *thread; + struct md_thread __rcu *thread; /* * Keep track of cluster resync window to send to other nodes. diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c index 0464d4d551fc..68c4d3a1fd25 100644 --- a/drivers/md/raid5-cache.c +++ b/drivers/md/raid5-cache.c @@ -120,7 +120,7 @@ struct r5l_log { struct bio_set bs; mempool_t meta_pool; - struct md_thread *reclaim_thread; + struct md_thread __rcu *reclaim_thread; unsigned long reclaim_target; /* number of space that need to be * reclaimed. if it's 0, reclaim spaces * used by io_units which are in @@ -1576,17 +1576,18 @@ void r5l_wake_reclaim(struct r5l_log *log, sector_t space) void r5l_quiesce(struct r5l_log *log, int quiesce) { - struct mddev *mddev; + struct mddev *mddev = log->rdev->mddev; + struct md_thread *thread = rcu_dereference_protected( + log->reclaim_thread, lockdep_is_held(&mddev->reconfig_mutex)); if (quiesce) { /* make sure r5l_write_super_and_discard_space exits */ - mddev = log->rdev->mddev; wake_up(&mddev->sb_wait); - kthread_park(log->reclaim_thread->tsk); + kthread_park(thread->tsk); r5l_wake_reclaim(log, MaxSector); r5l_do_reclaim(log); } else - kthread_unpark(log->reclaim_thread->tsk); + kthread_unpark(thread->tsk); } bool r5l_log_disk_error(struct r5conf *conf) @@ -3124,7 +3125,9 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev) if (md_register_thread(&log->reclaim_thread, r5l_reclaim_thread, log->rdev->mddev, "reclaim")) goto reclaim_thread; - log->reclaim_thread->timeout = R5C_RECLAIM_WAKEUP_INTERVAL; + + rcu_dereference_protected(log->reclaim_thread, true)->timeout = + R5C_RECLAIM_WAKEUP_INTERVAL; init_waitqueue_head(&log->iounit_wait); diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 04b1093195d0..a7e47c37daf3 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -7888,8 +7888,8 @@ static int raid5_run(struct mddev *mddev) } conf->min_offset_diff = min_offset_diff; - mddev->thread = conf->thread; - conf->thread = NULL; + rcu_assign_pointer(mddev->thread, conf->thread); + rcu_assign_pointer(conf->thread, NULL); mddev->private = conf; for (i = 0; i < conf->raid_disks && conf->previous_raid_disks; diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h index e873938a6125..f19707189a7b 100644 --- a/drivers/md/raid5.h +++ b/drivers/md/raid5.h @@ -679,7 +679,7 @@ struct r5conf { /* When taking over an array from a different personality, we store * the new thread here until we fully activate the array. */ - struct md_thread *thread; + struct md_thread __rcu *thread; struct list_head temp_inactive_list[NR_STRIPE_HASH_LOCKS]; struct r5worker_group *worker_groups; int group_cnt;