From patchwork Mon Sep 18 04:10:57 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388781
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 01/10] io_uring: allocate ctx id and build map between id and ctx
Date: Mon, 18 Sep 2023 12:10:57 +0800
Message-Id: <20230918041106.2134250-2-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Prepare for notifying uring_cmd drivers when a ctx/io_uring_task is going
away. A notifier callback will be registered by the driver so that it can
cancel in-flight commands which may depend on the io task.

To let the driver check whether a ctx matches a uring_cmd, allocate a ctx
id and provide it to the callback, so we avoid exposing the whole ctx
instance.

Add a global xarray, ctx_ids, to hold the mapping and to allocate a
unique id for each ctx.
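For readers following along, this is the standard allocating-xarray pattern
in isolation; a minimal sketch with hypothetical names (my_objs,
struct my_obj), not code from the patch:

static DEFINE_XARRAY_FLAGS(my_objs, XA_FLAGS_ALLOC);

struct my_obj {
	unsigned int id;
};

static int my_obj_register(struct my_obj *obj)
{
	/* stores obj and writes the allocated id (0..INT_MAX) into obj->id */
	return xa_alloc(&my_objs, &obj->id, obj, xa_limit_31b, GFP_KERNEL);
}

static struct my_obj *my_obj_lookup(unsigned int id)
{
	/* RCU-safe lookup; returns NULL once the id has been erased */
	return xa_load(&my_objs, id);
}

static void my_obj_unregister(struct my_obj *obj)
{
	xa_erase(&my_objs, obj->id);
}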
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/io_uring.h       | 2 ++
 include/linux/io_uring_types.h | 3 +++
 io_uring/io_uring.c            | 9 +++++++++
 3 files changed, 14 insertions(+)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 106cdc55ff3b..ec9714e36477 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -41,6 +41,8 @@ static inline const void *io_uring_sqe_cmd(const struct io_uring_sqe *sqe)
 	return sqe->cmd;
 }
 
+#define IO_URING_INVALID_CTX_ID		UINT_MAX
+
 #if defined(CONFIG_IO_URING)
 int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
 			      struct iov_iter *iter, void *ioucmd);
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 13d19b9be9f4..d310bb073101 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -215,6 +215,9 @@ struct io_ring_ctx {
 		struct percpu_ref	refs;
 
 		enum task_work_notify_mode	notify_method;
+
+		/* for uring cmd driver to retrieve context */
+		unsigned int		id;
 	} ____cacheline_aligned_in_smp;
 
 	/* submission data */
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 783ed0fff71b..c015c070ff85 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -175,6 +175,9 @@ static struct ctl_table kernel_io_uring_disabled_table[] = {
 };
 #endif
 
+/* mapping between io_ring_ctx instance and its ctx_id */
+static DEFINE_XARRAY_FLAGS(ctx_ids, XA_FLAGS_ALLOC);
+
 struct sock *io_uring_get_socket(struct file *file)
 {
 #if defined(CONFIG_UNIX)
@@ -303,6 +306,10 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 
 	xa_init(&ctx->io_bl_xa);
 
+	ctx->id = IO_URING_INVALID_CTX_ID;
+	if (xa_alloc(&ctx_ids, &ctx->id, ctx, xa_limit_31b, GFP_KERNEL))
+		goto err;
+
 	/*
 	 * Use 5 bits less than the max cq entries, that should give us around
 	 * 32 entries per hash list if totally full and uniformly spread, but
@@ -356,6 +363,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	kfree(ctx->cancel_table_locked.hbs);
 	kfree(ctx->io_bl);
 	xa_destroy(&ctx->io_bl_xa);
+	xa_erase(&ctx_ids, ctx->id);
 	kfree(ctx);
 	return NULL;
 }
@@ -2929,6 +2937,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 	kfree(ctx->cancel_table_locked.hbs);
 	kfree(ctx->io_bl);
 	xa_destroy(&ctx->io_bl_xa);
+	xa_erase(&ctx_ids, ctx->id);
 	kfree(ctx);
 }

From patchwork Mon Sep 18 04:10:58 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388782
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 02/10] io_uring: pass io_uring_ctx->id to uring_cmd
Date: Mon, 18 Sep 2023 12:10:58 +0800
Message-Id: <20230918041106.2134250-3-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Pass io_uring ctx->id to the uring_cmd driver, preparing for the
io_uring ctx/task exit notifier.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/io_uring.h | 6 +++++-
 io_uring/uring_cmd.c     | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index ec9714e36477..c395807bd7cf 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -33,7 +33,11 @@ struct io_uring_cmd {
 	};
 	u32 cmd_op;
 	u32 flags;
-	u8 pdu[32]; /* available inline for free use */
+	union {
+		/* driver needs to save ctx_id */
+		u32 ctx_id;
+		u8 pdu[32]; /* available inline for free use */
+	};
 };
 
 static inline const void *io_uring_sqe_cmd(const struct io_uring_sqe *sqe)
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 537795fddc87..c54c627fb6b9 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -105,6 +105,7 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		req->imu = ctx->user_bufs[index];
 		io_req_set_rsrc_node(req, ctx, 0);
 	}
+	ioucmd->ctx_id = req->ctx->id;
 	ioucmd->sqe = sqe;
 	ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
 	return 0;
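One subtlety worth noting: ctx_id shares storage with the first four bytes
of pdu[], so a driver that needs the id after filling its own pdu data must
copy it out first. A hedged sketch with hypothetical names (my_cmd_pdu,
my_drv_prep_cmd):

struct my_cmd_pdu {
	u32 saved_ctx_id;		/* ctx_id copied out at prep time */
};

static void my_drv_prep_cmd(struct io_uring_cmd *cmd)
{
	u32 ctx_id = cmd->ctx_id;	/* read before pdu[] is overwritten */
	struct my_cmd_pdu *pdu = (struct my_cmd_pdu *)cmd->pdu;

	pdu->saved_ctx_id = ctx_id;
}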
From patchwork Mon Sep 18 04:10:59 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388783
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 03/10] io_uring: support io_uring notifier for uring_cmd
Date: Mon, 18 Sep 2023 12:10:59 +0800
Message-Id: <20230918041106.2134250-4-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Add a notifier that a driver can register to get notified; so far it is
only for uring_cmd based drivers. With this notifier, the driver can
cancel in-flight commands when the ctx is being released or the io task
is exiting.

The main use case is ublk (and probably fuse with uring_cmd support), in
which a uring command may never complete. The driver has to cancel such
commands when the io task is exiting or the ctx is being released,
otherwise __io_uring_cancel() may wait forever because of the inflight
commands.
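To make the intended usage concrete, here is a hedged driver-side sketch
built only on the API added by this patch; the my_drv_* helpers are
assumed to exist in the driver:

#include <linux/io_uring.h>
#include <linux/notifier.h>

static int my_drv_io_uring_notify(struct notifier_block *nb,
				  unsigned long event, void *data)
{
	struct io_uring_notifier_data *nd = data;

	switch (event) {
	case IO_URING_NOTIFIER_CTX_EXIT:
		/* the whole ring identified by nd->ctx_id is going away */
		my_drv_cancel_cmds_for_ctx(nd->ctx_id);
		break;
	case IO_URING_NOTIFIER_IO_TASK_EXIT:
		/* only commands submitted by nd->task need canceling */
		my_drv_cancel_cmds_for_task(nd->ctx_id, nd->task);
		break;
	}
	return 0;
}

static struct notifier_block my_drv_nb = {
	.notifier_call = my_drv_io_uring_notify,
};

The driver would call io_uring_cmd_register_notifier(&my_drv_nb) when its
char device is opened and io_uring_cmd_unregister_notifier(&my_drv_nb) on
release, as the later ublk patch in this series does.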
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/io_uring.h | 19 ++++++++++++++++
 io_uring/io_uring.c      | 48 ++++++++++++++++++++++++++++++++++++++++
 io_uring/io_uring.h      |  4 ++++
 io_uring/uring_cmd.c     | 12 ++++++++++
 4 files changed, 83 insertions(+)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index c395807bd7cf..037bff9960a1 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -47,7 +47,19 @@ static inline const void *io_uring_sqe_cmd(const struct io_uring_sqe *sqe)
 
 #define IO_URING_INVALID_CTX_ID		UINT_MAX
 
+enum io_uring_notifier {
+	IO_URING_NOTIFIER_CTX_EXIT,
+	IO_URING_NOTIFIER_IO_TASK_EXIT,
+};
+
+struct io_uring_notifier_data {
+	unsigned int ctx_id;
+	const struct task_struct *task;
+};
+
 #if defined(CONFIG_IO_URING)
+int io_uring_cmd_register_notifier(struct notifier_block *nb);
+void io_uring_cmd_unregister_notifier(struct notifier_block *nb);
 int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
 			      struct iov_iter *iter, void *ioucmd);
 void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2,
@@ -89,6 +101,13 @@ static inline void io_uring_free(struct task_struct *tsk)
 }
 int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags);
 #else
+static inline int io_uring_cmd_register_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+static inline void io_uring_cmd_unregister_notifier(struct notifier_block *nb)
+{
+}
 static inline int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len,
 			int rw, struct iov_iter *iter, void *ioucmd)
 {
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index c015c070ff85..de9b217bf5d8 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -73,6 +73,7 @@
 #include
 #include
 #include
+#include <linux/notifier.h>
 
 #define CREATE_TRACE_POINTS
 #include
@@ -178,6 +179,22 @@ static struct ctl_table kernel_io_uring_disabled_table[] = {
 
 /* mapping between io_ring_ctx instance and its ctx_id */
 static DEFINE_XARRAY_FLAGS(ctx_ids, XA_FLAGS_ALLOC);
 
+/*
+ * Uring_cmd driver can register to be notified when ctx/io_uring_task
+ * is going away for canceling inflight commands.
+ */
+static struct srcu_notifier_head notifier_chain;
+
+int io_uring_register_notifier(struct notifier_block *nb)
+{
+	return srcu_notifier_chain_register(&notifier_chain, nb);
+}
+
+void io_uring_unregister_notifier(struct notifier_block *nb)
+{
+	srcu_notifier_chain_unregister(&notifier_chain, nb);
+}
+
 struct sock *io_uring_get_socket(struct file *file)
 {
 #if defined(CONFIG_UNIX)
@@ -191,6 +208,11 @@ struct sock *io_uring_get_socket(struct file *file)
 }
 EXPORT_SYMBOL(io_uring_get_socket);
 
+struct io_ring_ctx *io_uring_id_to_ctx(unsigned int id)
+{
+	return (struct io_ring_ctx *)xa_load(&ctx_ids, id);
+}
+
 static inline void io_submit_flush_completions(struct io_ring_ctx *ctx)
 {
 	if (!wq_list_empty(&ctx->submit_state.compl_reqs) ||
@@ -3060,6 +3082,23 @@ static __cold bool io_cancel_ctx_cb(struct io_wq_work *work, void *data)
 	return req->ctx == data;
 }
 
+static __cold void io_uring_cancel_notify(struct io_ring_ctx *ctx,
+					  struct task_struct *task)
+{
+	struct io_uring_notifier_data notifier_data = {
+		.ctx_id = ctx->id,
+		.task = task,
+	};
+	enum io_uring_notifier notifier;
+
+	if (!task)
+		notifier = IO_URING_NOTIFIER_CTX_EXIT;
+	else
+		notifier = IO_URING_NOTIFIER_IO_TASK_EXIT;
+
+	srcu_notifier_call_chain(&notifier_chain, notifier, &notifier_data);
+}
+
 static __cold void io_ring_exit_work(struct work_struct *work)
 {
 	struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx, exit_work);
@@ -3069,6 +3108,8 @@ static __cold void io_ring_exit_work(struct work_struct *work)
 	struct io_tctx_node *node;
 	int ret;
 
+	io_uring_cancel_notify(ctx, NULL);
+
 	/*
 	 * If we're doing polled IO and end up having requests being
 	 * submitted async (out-of-line), then completions can come in while
@@ -3346,6 +3387,11 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
 	if (tctx->io_wq)
 		io_wq_exit_start(tctx->io_wq);
 
+	if (!cancel_all) {
+		xa_for_each(&tctx->xa, index, node)
+			io_uring_cancel_notify(node->ctx, current);
+	}
+
 	atomic_inc(&tctx->in_cancel);
 	do {
 		bool loop = false;
@@ -4695,6 +4741,8 @@ static int __init io_uring_init(void)
 	register_sysctl_init("kernel", kernel_io_uring_disabled_table);
 #endif
 
+	srcu_init_notifier_head(&notifier_chain);
+
 	return 0;
 };
 __initcall(io_uring_init);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 547c30582fb8..1d5588d8a88a 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -38,6 +38,10 @@ enum {
 	IOU_STOP_MULTISHOT = -ECANCELED,
 };
 
+struct io_ring_ctx *io_uring_id_to_ctx(unsigned int id);
+int io_uring_register_notifier(struct notifier_block *nb);
+void io_uring_unregister_notifier(struct notifier_block *nb);
+
 bool io_cqe_cache_refill(struct io_ring_ctx *ctx, bool overflow);
 void io_req_cqe_overflow(struct io_kiocb *req);
 int io_run_task_work_sig(struct io_ring_ctx *ctx);
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index c54c627fb6b9..03e3a8c1b712 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -192,3 +192,15 @@ int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags)
 	}
 }
 EXPORT_SYMBOL_GPL(io_uring_cmd_sock);
+
+int io_uring_cmd_register_notifier(struct notifier_block *nb)
+{
+	return io_uring_register_notifier(nb);
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_register_notifier);
+
+void io_uring_cmd_unregister_notifier(struct notifier_block *nb)
+{
+	io_uring_unregister_notifier(nb);
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_unregister_notifier);
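For background, the SRCU notifier machinery used above follows the usual
kernel pattern; a condensed, generic provider-side sketch (my_chain and
my_subsys_* are illustrative names):

#include <linux/notifier.h>

static struct srcu_notifier_head my_chain;

static int __init my_subsys_init(void)
{
	/* must run before the chain is first used */
	srcu_init_notifier_head(&my_chain);
	return 0;
}

static void my_subsys_fire_event(unsigned long event, void *data)
{
	/* callbacks may sleep; SRCU keeps the read side cheap */
	srcu_notifier_call_chain(&my_chain, event, data);
}

An SRCU chain is presumably chosen here because the notification fires on
exit paths where the call side should stay lightweight while still allowing
blocking callbacks and safe concurrent unregistration.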
From patchwork Mon Sep 18 04:11:00 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388785
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 04/10] ublk: don't get ublk device reference in ublk_abort_queue()
Date: Mon, 18 Sep 2023 12:11:00 +0800
Message-Id: <20230918041106.2134250-5-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

ublk_abort_queue() is called from ublk_daemon_monitor_work(), where the
ublk device is guaranteed to be live, so there is no need to take a
device reference.
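The reasoning generalizes: when the only caller already pins the object for
the whole call, the callee's own get/put pair is redundant. A hedged sketch
with hypothetical names:

static void my_abort(struct my_dev *dev)
{
	/* no my_dev_get()/my_dev_put() needed: the caller pins dev */
}

static void my_monitor_work_fn(struct work_struct *work)
{
	struct my_dev *dev = container_of(work, struct my_dev, monitor_work);

	/* dev stays live: the work item is canceled before dev is freed */
	my_abort(dev);
}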
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 630ddfe6657b..9b3c0b3dd36e 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1419,9 +1419,6 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 {
 	int i;
 
-	if (!ublk_get_device(ub))
-		return;
-
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];
 
@@ -1437,7 +1434,6 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 			__ublk_fail_req(ubq, io, rq);
 		}
 	}
-	ublk_put_device(ub);
 }
 
 static void ublk_daemon_monitor_work(struct work_struct *work)

From patchwork Mon Sep 18 04:11:01 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388784
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 05/10] ublk: make sure ublk uring cmd handling is done in submitter task context
Date: Mon, 18 Sep 2023 12:11:01 +0800
Message-Id: <20230918041106.2134250-6-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>
In a well-implemented ublk server, a ublk io command won't be linked into
any SQE chain, and commands are always handled in no-wait style, so ublk
uring_cmd handling is normally done in the submitter task context.

However, the server may set IOSQE_ASYNC, or an io command may be linked
into a chain by mistake; then we can still end up in io-wq context where
ctx->uring_lock isn't held. So in case of IO_URING_F_UNLOCKED, reschedule
the command via io_uring_cmd_complete_in_task() so that it runs in the
submitter task.

Then ublk_ch_uring_cmd_local() is always called with the ctx uring_lock
held, and we needn't worry about synchronization along the submission
code path.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 9b3c0b3dd36e..46d499d96ca3 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1810,7 +1810,8 @@ static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
 	return NULL;
 }
 
-static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
+static inline int ublk_ch_uring_cmd_local(struct io_uring_cmd *cmd,
+					  unsigned int issue_flags)
 {
 	/*
 	 * Not necessary for async retry, but let's keep it simple and always
@@ -1824,9 +1825,28 @@ static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
 		.addr = READ_ONCE(ub_src->addr)
 	};
 
+	WARN_ON_ONCE(issue_flags & IO_URING_F_UNLOCKED);
+
 	return __ublk_ch_uring_cmd(cmd, issue_flags, &ub_cmd);
 }
 
+static void ublk_ch_uring_cmd_cb(struct io_uring_cmd *cmd,
+				 unsigned int issue_flags)
+{
+	ublk_ch_uring_cmd_local(cmd, issue_flags);
+}
+
+static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
+{
+	/* well-implemented server won't run into unlocked */
+	if (unlikely(issue_flags & IO_URING_F_UNLOCKED)) {
+		io_uring_cmd_complete_in_task(cmd, ublk_ch_uring_cmd_cb);
+		return -EIOCBQUEUED;
+	}
+
+	return ublk_ch_uring_cmd_local(cmd, issue_flags);
+}
+
 static inline bool ublk_check_ubuf_dir(const struct request *req, int ubuf_dir)
 {
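The same dispatch shape should work for any uring_cmd driver that wants
single-task handling; a hedged template mirroring the ublk logic above,
with placeholder my_drv_* names:

static void my_drv_cmd_tw_cb(struct io_uring_cmd *cmd,
			     unsigned int issue_flags)
{
	/* now running in the submitter task with the ctx uring_lock held */
	my_drv_handle_cmd(cmd, issue_flags);
}

static int my_drv_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
{
	if (unlikely(issue_flags & IO_URING_F_UNLOCKED)) {
		/* punt from io-wq back to the submitter task via task_work */
		io_uring_cmd_complete_in_task(cmd, my_drv_cmd_tw_cb);
		return -EIOCBQUEUED;
	}
	return my_drv_handle_cmd(cmd, issue_flags);
}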
From patchwork Mon Sep 18 04:11:02 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388786
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 06/10] ublk: make sure that uring cmd aiming at same queue won't cross io_uring contexts
Date: Mon, 18 Sep 2023 12:11:02 +0800
Message-Id: <20230918041106.2134250-7-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Make sure that all commands aimed at the same ublk queue come from the
same io_uring context. This is a very reasonable requirement; there is no
obvious reason for userspace to send uring_cmds to the same queue from
multiple io_uring contexts.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 46d499d96ca3..52dd53662ffb 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -131,6 +131,7 @@ struct ublk_queue {
 	unsigned long flags;
 	struct task_struct *ubq_daemon;
 	char *io_cmd_buf;
+	unsigned int ctx_id;
 
 	struct llist_head io_cmds;
 
@@ -1410,6 +1411,11 @@ static void ublk_commit_completion(struct ublk_device *ub,
 	ublk_put_req_ref(ubq, req);
 }
 
+static inline bool ublk_ctx_id_is_valid(unsigned int ctx_id)
+{
+	return ctx_id != IO_URING_INVALID_CTX_ID;
+}
+
 /*
  * When ->ubq_daemon is exiting, either new request is ended immediately,
  * or any queued io command is drained, so it is safe to abort queue
@@ -1609,11 +1615,13 @@ static void ublk_stop_dev(struct ublk_device *ub)
 }
 
 /* device can only be started after all IOs are ready */
-static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq)
+static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq,
+			       unsigned int ctx_id)
 {
 	mutex_lock(&ub->mutex);
 	ubq->nr_io_ready++;
 	if (ublk_queue_ready(ubq)) {
+		ubq->ctx_id = ctx_id;
 		ubq->ubq_daemon = current;
 		get_task_struct(ubq->ubq_daemon);
 		ub->nr_queues_ready++;
@@ -1682,6 +1690,9 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
 	if (ubq->ubq_daemon && ubq->ubq_daemon != current)
 		goto out;
 
+	if (ublk_ctx_id_is_valid(ubq->ctx_id) && cmd->ctx_id != ubq->ctx_id)
+		goto out;
+
 	if (tag >= ubq->q_depth)
 		goto out;
 
@@ -1734,7 +1745,7 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
 		}
 
 		ublk_fill_io_cmd(io, cmd, ub_cmd->addr);
-		ublk_mark_io_ready(ub, ubq);
+		ublk_mark_io_ready(ub, ubq, cmd->ctx_id);
 		break;
 	case UBLK_IO_COMMIT_AND_FETCH_REQ:
 		req = blk_mq_tag_to_rq(ub->tag_set.tags[ub_cmd->q_id], tag);
@@ -1989,6 +2000,7 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
 
 	ubq->io_cmd_buf = ptr;
 	ubq->dev = ub;
+	ubq->ctx_id = IO_URING_INVALID_CTX_ID;
 	return 0;
 }
 
@@ -2593,6 +2605,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
 	ubq->ubq_daemon = NULL;
 	ubq->timeout = false;
 
+	ubq->ctx_id = IO_URING_INVALID_CTX_ID;
+
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];
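The binding rule enforced above, restated in isolation (my_queue is a
hypothetical stand-in for ublk_queue; IO_URING_INVALID_CTX_ID is the
sentinel from patch 1): the first ring that readies a queue owns it, and
commands carrying any other ctx id are rejected afterwards.

static bool my_queue_accepts_cmd(struct my_queue *q, u32 cmd_ctx_id)
{
	if (q->ctx_id == IO_URING_INVALID_CTX_ID)
		return true;		/* not bound to a ring yet */
	return q->ctx_id == cmd_ctx_id;	/* must match the owner ring */
}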
From patchwork Mon Sep 18 04:11:03 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388790
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 07/10] ublk: rename mm_lock as lock
Date: Mon, 18 Sep 2023 12:11:03 +0800
Message-Id: <20230918041106.2134250-8-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Rename the mm_lock field of ublk_device to lock, so that the lock can be
reused for protecting access to ub->ub_disk, which will be used for
simplifying ublk_abort_queue() by quiescing the queue in the next patch.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 52dd53662ffb..4bc4c4f87b36 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -167,7 +167,7 @@ struct ublk_device {
 
 	struct mutex mutex;
 
-	spinlock_t mm_lock;
+	spinlock_t lock;
 	struct mm_struct *mm;
 
 	struct ublk_params params;
@@ -1358,12 +1358,12 @@ static int ublk_ch_mmap(struct file *filp, struct vm_area_struct *vma)
 	unsigned long pfn, end, phys_off = vma->vm_pgoff << PAGE_SHIFT;
 	int q_id, ret = 0;
 
-	spin_lock(&ub->mm_lock);
+	spin_lock(&ub->lock);
 	if (!ub->mm)
 		ub->mm = current->mm;
 	if (current->mm != ub->mm)
 		ret = -EINVAL;
-	spin_unlock(&ub->mm_lock);
+	spin_unlock(&ub->lock);
 
 	if (ret)
 		return ret;
@@ -2348,7 +2348,7 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
 	if (!ub)
 		goto out_unlock;
 	mutex_init(&ub->mutex);
-	spin_lock_init(&ub->mm_lock);
+	spin_lock_init(&ub->lock);
 	INIT_WORK(&ub->quiesce_work, ublk_quiesce_work_fn);
 	INIT_WORK(&ub->stop_work, ublk_stop_work_fn);
 	INIT_DELAYED_WORK(&ub->monitor_work, ublk_daemon_monitor_work);
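What the rename buys is the classic snapshot-and-pin pattern; a hedged
sketch of the reader side (the helper name is invented, but the next patch
open-codes exactly this, with the writer clearing ub->ub_disk under the
same lock before dropping its reference):

static struct gendisk *ublk_get_disk_ref(struct ublk_device *ub)
{
	struct gendisk *disk;

	spin_lock(&ub->lock);
	disk = ub->ub_disk;
	if (disk)
		get_device(disk_to_dev(disk));	/* pin before unlocking */
	spin_unlock(&ub->lock);

	return disk;	/* may be NULL if the disk is already gone */
}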
From patchwork Mon Sep 18 04:11:04 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388787
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 08/10] ublk: quiesce request queue when aborting queue
Date: Mon, 18 Sep 2023 12:11:04 +0800
Message-Id: <20230918041106.2134250-9-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

So far, aborting a queue ends requests when the ubq daemon is exiting, and
this can run concurrently with ublk_queue_rq(). That approach is fragile,
and we depend on tricky use of UBLK_IO_FLAG_ABORTED to avoid the race.

Quiesce the request queue when aborting it, so the two code paths run
completely exclusively of each other. That also makes it easier to add new
ublk features, such as relaxing the single-task limit for each queue.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 43 ++++++++++++++++++++++++++++++++++------
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 4bc4c4f87b36..3b691bf3d9ef 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1446,21 +1446,45 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
 {
 	struct ublk_device *ub =
 		container_of(work, struct ublk_device, monitor_work.work);
+	struct gendisk *disk;
 	int i;
 
 	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
 		struct ublk_queue *ubq = ublk_get_queue(ub, i);
 
-		if (ubq_daemon_is_dying(ubq)) {
-			if (ublk_queue_can_use_recovery(ubq))
-				schedule_work(&ub->quiesce_work);
-			else
-				schedule_work(&ub->stop_work);
+		if (ubq_daemon_is_dying(ubq))
+			goto found;
+	}
+	return;
+
+found:
+	spin_lock(&ub->lock);
+	disk = ub->ub_disk;
+	if (disk)
+		get_device(disk_to_dev(disk));
+	spin_unlock(&ub->lock);
+
+	/* Our disk has been dead */
+	if (!disk)
+		return;
+
+	/* Now we are serialized with ublk_queue_rq() */
+	blk_mq_quiesce_queue(disk->queue);
+	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
+		struct ublk_queue *ubq = ublk_get_queue(ub, i);
 
+		if (ubq_daemon_is_dying(ubq)) {
 			/* abort queue is for making forward progress */
 			ublk_abort_queue(ub, ubq);
 		}
 	}
+	blk_mq_unquiesce_queue(disk->queue);
+	put_device(disk_to_dev(disk));
+
+	if (ublk_can_use_recovery(ub))
+		schedule_work(&ub->quiesce_work);
+	else
+		schedule_work(&ub->stop_work);
 
 	/*
 	 * We can't schedule monitor work after ub's state is not UBLK_S_DEV_LIVE.
@@ -1595,6 +1619,8 @@ static void ublk_unquiesce_dev(struct ublk_device *ub)
 
 static void ublk_stop_dev(struct ublk_device *ub)
 {
+	struct gendisk *disk;
+
 	mutex_lock(&ub->mutex);
 	if (ub->dev_info.state == UBLK_S_DEV_DEAD)
 		goto unlock;
@@ -1604,10 +1630,15 @@ static void ublk_stop_dev(struct ublk_device *ub)
 		ublk_unquiesce_dev(ub);
 	}
 	del_gendisk(ub->ub_disk);
+
+	/* Sync with ublk_abort_queue() by holding the lock */
+	spin_lock(&ub->lock);
+	disk = ub->ub_disk;
 	ub->dev_info.state = UBLK_S_DEV_DEAD;
 	ub->dev_info.ublksrv_pid = -1;
-	put_disk(ub->ub_disk);
 	ub->ub_disk = NULL;
+	spin_unlock(&ub->lock);
+	put_disk(disk);
  unlock:
 	ublk_cancel_dev(ub);
 	mutex_unlock(&ub->mutex);
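The serialization argument relies on the blk-mq quiesce contract:
blk_mq_quiesce_queue() waits for all in-flight ->queue_rq() calls and
blocks new ones until unquiesce. A reduced sketch, with my_abort_ios()
standing in for the per-queue abort loop:

static void my_abort_all(struct gendisk *disk)
{
	blk_mq_quiesce_queue(disk->queue);
	/* no ->queue_rq() can run between these two calls */
	my_abort_ios(disk);
	blk_mq_unquiesce_queue(disk->queue);
}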
From patchwork Mon Sep 18 04:11:05 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388789
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 09/10] ublk: replace monitor work with uring_cmd exit notifier
Date: Mon, 18 Sep 2023 12:11:05 +0800
Message-Id: <20230918041106.2134250-10-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Monitor work introduces one extra context for handling abort; that is
prone to races and adds extra delay when aborting. Now that uring_cmd has
an exit notifier, use it instead:

1) The notifier callback either runs from the uring_cmd submission task
context or is called after the io_uring context has exited, so the
callback runs exclusively with ublk_ch_uring_cmd() and
__ublk_rq_task_work().

2) The previous patch quiesces the request queue when calling
ublk_abort_queue(), which is now completely exclusive with
ublk_queue_rq() and ublk_ch_uring_cmd()/__ublk_rq_task_work().

This simplifies aborting the queue and helps with adding new features,
such as relaxing the single-task limit for handling one queue.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 89 +++++++++++++++++++---------------------
 1 file changed, 42 insertions(+), 47 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 3b691bf3d9ef..90e0137ff784 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -144,8 +144,6 @@ struct ublk_queue {
 	struct ublk_io ios[];
 };
 
-#define UBLK_DAEMON_MONITOR_PERIOD	(5 * HZ)
-
 struct ublk_device {
 	struct gendisk *ub_disk;
 
@@ -176,11 +174,7 @@ struct ublk_device {
 	unsigned int nr_queues_ready;
 	unsigned int nr_privileged_daemon;
 
-	/*
-	 * Our ubq->daemon may be killed without any notification, so
-	 * monitor each queue's daemon periodically
-	 */
-	struct delayed_work monitor_work;
+	struct notifier_block notif;
 	struct work_struct quiesce_work;
 	struct work_struct stop_work;
 };
@@ -194,7 +188,6 @@ struct ublk_params_header {
 static inline unsigned int ublk_req_build_flags(struct request *req);
 static inline struct ublksrv_io_desc *ublk_get_iod(struct ublk_queue *ubq,
 						   int tag);
-
 static inline bool ublk_dev_is_user_copy(const struct ublk_device *ub)
 {
 	return ub->dev_info.flags & UBLK_F_USER_COPY;
@@ -1119,8 +1112,6 @@ static inline void __ublk_abort_rq(struct ublk_queue *ubq,
 		blk_mq_requeue_request(rq, false);
 	else
 		blk_mq_end_request(rq, BLK_STS_IOERR);
-
-	mod_delayed_work(system_wq, &ubq->dev->monitor_work, 0);
 }
 
 static inline void __ublk_rq_task_work(struct request *req,
@@ -1241,12 +1232,12 @@ static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
 	io = &ubq->ios[rq->tag];
 	/*
 	 * If the check pass, we know that this is a re-issued request aborted
-	 * previously in monitor_work because the ubq_daemon(cmd's task) is
+	 * previously in exit notifier because the ubq_daemon(cmd's task) is
 	 * PF_EXITING. We cannot call io_uring_cmd_complete_in_task() anymore
 	 * because this ioucmd's io_uring context may be freed now if no inflight
 	 * ioucmd exists. Otherwise we may cause null-deref in ctx->fallback_work.
 	 *
-	 * Note: monitor_work sets UBLK_IO_FLAG_ABORTED and ends this request(releasing
+	 * Note: exit notifier sets UBLK_IO_FLAG_ABORTED and ends this request(releasing
 	 * the tag). Then the request is re-started(allocating the tag) and we are here.
 	 * Since releasing/allocating a tag implies smp_mb(), finding UBLK_IO_FLAG_ABORTED
 	 * guarantees that here is a re-issued request aborted previously.
@@ -1334,9 +1325,17 @@ static int ublk_ch_open(struct inode *inode, struct file *filp)
 {
 	struct ublk_device *ub = container_of(inode->i_cdev,
 			struct ublk_device, cdev);
+	int ret;
 
 	if (test_and_set_bit(UB_STATE_OPEN, &ub->state))
 		return -EBUSY;
+
+	ret = io_uring_cmd_register_notifier(&ub->notif);
+	if (ret) {
+		clear_bit(UB_STATE_OPEN, &ub->state);
+		return ret;
+	}
+
 	filp->private_data = ub;
 	return 0;
 }
@@ -1346,6 +1345,8 @@ static int ublk_ch_release(struct inode *inode, struct file *filp)
 	struct ublk_device *ub = filp->private_data;
 
 	clear_bit(UB_STATE_OPEN, &ub->state);
+	io_uring_cmd_unregister_notifier(&ub->notif);
+
 	return 0;
 }
 
@@ -1417,9 +1418,9 @@ static inline bool ublk_ctx_id_is_valid(unsigned int ctx_id)
 }
 
 /*
- * When ->ubq_daemon is exiting, either new request is ended immediately,
- * or any queued io command is drained, so it is safe to abort queue
- * lockless
+ * Called from ubq_daemon context via the notifier, meantime quiesce ublk
+ * blk-mq queue, so we are called exclusively with blk-mq and ubq_daemon
+ * context, so everything is serialized.
  */
 static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 {
@@ -1442,20 +1443,34 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 	}
 }
 
-static void ublk_daemon_monitor_work(struct work_struct *work)
+static inline bool ubq_daemon_is_exiting(const struct ublk_queue *ubq,
+		const struct io_uring_notifier_data *data)
 {
-	struct ublk_device *ub =
-		container_of(work, struct ublk_device, monitor_work.work);
+	return data->ctx_id == ubq->ctx_id && (!data->task ||
+			data->task == ubq->ubq_daemon);
+}
+
+static int ublk_notifier_cb(struct notifier_block *nb,
+		unsigned long event, void *val)
+{
+	struct ublk_device *ub = container_of(nb, struct ublk_device, notif);
+	struct io_uring_notifier_data *data = val;
 	struct gendisk *disk;
 	int i;
 
+	pr_devel("%s: event %lu ctx_id %u task %p\n", __func__, event,
+			data->ctx_id, data->task);
+
+	if (!ublk_ctx_id_is_valid(data->ctx_id))
+		return 0;
+
 	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
 		struct ublk_queue *ubq = ublk_get_queue(ub, i);
 
-		if (ubq_daemon_is_dying(ubq))
+		if (ubq_daemon_is_exiting(ubq, data))
 			goto found;
 	}
-	return;
+	return 0;
 
 found:
 	spin_lock(&ub->lock);
@@ -1466,14 +1481,14 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
 
 	/* Our disk has been dead */
 	if (!disk)
-		return;
+		return 0;
 
 	/* Now we are serialized with ublk_queue_rq() */
 	blk_mq_quiesce_queue(disk->queue);
 	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
 		struct ublk_queue *ubq = ublk_get_queue(ub, i);
 
-		if (ubq_daemon_is_dying(ubq)) {
+		if (ubq_daemon_is_exiting(ubq, data)) {
 			/* abort queue is for making forward progress */
 			ublk_abort_queue(ub, ubq);
 		}
@@ -1485,17 +1500,7 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
 		schedule_work(&ub->quiesce_work);
 	else
 		schedule_work(&ub->stop_work);
-
-	/*
-	 * We can't schedule monitor work after ub's state is not UBLK_S_DEV_LIVE.
-	 * after ublk_remove() or __ublk_quiesce_dev() is started.
-	 *
-	 * No need ub->mutex, monitor work are canceled after state is marked
-	 * as not LIVE, so new state is observed reliably.
-	 */
-	if (ub->dev_info.state == UBLK_S_DEV_LIVE)
-		schedule_delayed_work(&ub->monitor_work,
-				UBLK_DAEMON_MONITOR_PERIOD);
+	return 0;
 }
 
 static inline bool ublk_queue_ready(struct ublk_queue *ubq)
@@ -1512,6 +1517,9 @@ static void ublk_cancel_queue(struct ublk_queue *ubq)
 {
 	int i;
 
+	if (ublk_ctx_id_is_valid(ubq->ctx_id))
+		ubq->ctx_id = IO_URING_INVALID_CTX_ID;
+
 	if (!ublk_queue_ready(ubq))
 		return;
 
@@ -1572,15 +1580,6 @@ static void __ublk_quiesce_dev(struct ublk_device *ub)
 	ublk_wait_tagset_rqs_idle(ub);
 	ub->dev_info.state = UBLK_S_DEV_QUIESCED;
 	ublk_cancel_dev(ub);
-	/* we are going to release task_struct of ubq_daemon and resets
-	 * ->ubq_daemon to NULL. So in monitor_work, check on ubq_daemon causes UAF.
-	 * Besides, monitor_work is not necessary in QUIESCED state since we have
-	 * already scheduled quiesce_work and quiesced all ubqs.
-	 *
-	 * Do not let monitor_work schedule itself if state it QUIESCED. And we cancel
-	 * it here and re-schedule it in END_USER_RECOVERY to avoid UAF.
-	 */
-	cancel_delayed_work_sync(&ub->monitor_work);
 }
 
 static void ublk_quiesce_work_fn(struct work_struct *work)
@@ -1642,7 +1641,6 @@ static void ublk_stop_dev(struct ublk_device *ub)
  unlock:
 	ublk_cancel_dev(ub);
 	mutex_unlock(&ub->mutex);
-	cancel_delayed_work_sync(&ub->monitor_work);
 }
 
 /* device can only be started after all IOs are ready */
@@ -2210,8 +2208,6 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub, struct io_uring_cmd *cmd)
 	if (wait_for_completion_interruptible(&ub->completion) != 0)
 		return -EINTR;
 
-	schedule_delayed_work(&ub->monitor_work, UBLK_DAEMON_MONITOR_PERIOD);
-
 	mutex_lock(&ub->mutex);
 	if (ub->dev_info.state == UBLK_S_DEV_LIVE ||
 	    test_bit(UB_STATE_USED, &ub->state)) {
@@ -2382,7 +2378,7 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
 	spin_lock_init(&ub->lock);
 	INIT_WORK(&ub->quiesce_work, ublk_quiesce_work_fn);
 	INIT_WORK(&ub->stop_work, ublk_stop_work_fn);
-	INIT_DELAYED_WORK(&ub->monitor_work, ublk_daemon_monitor_work);
+	ub->notif.notifier_call = ublk_notifier_cb;
 
 	ret = ublk_alloc_dev_number(ub, header->dev_id);
 	if (ret < 0)
@@ -2722,7 +2718,6 @@ static int ublk_ctrl_end_recovery(struct ublk_device *ub,
 			__func__, header->dev_id);
 	blk_mq_kick_requeue_list(ub->ub_disk->queue);
 	ub->dev_info.state = UBLK_S_DEV_LIVE;
-	schedule_delayed_work(&ub->monitor_work, UBLK_DAEMON_MONITOR_PERIOD);
 	ret = 0;
 out_unlock:
 	mutex_unlock(&ub->mutex);
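The per-device callback above uses the standard embedded-notifier idiom;
reduced to its core with a hypothetical my_dev type:

struct my_dev {
	struct notifier_block notif;	/* embedded, one per device */
};

static int my_notifier_cb(struct notifier_block *nb,
			  unsigned long event, void *val)
{
	/* recover the owning device from the embedded member */
	struct my_dev *dev = container_of(nb, struct my_dev, notif);

	(void)dev;	/* ... filter the event, then act on dev ... */
	return 0;	/* NOTIFY_DONE */
}

Registering at open time and unregistering at release time, as done above,
bounds the callback's lifetime to that of the char device file.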
From patchwork Mon Sep 18 04:11:06 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388788
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 10/10] ublk: simplify aborting request
Date: Mon, 18 Sep 2023 12:11:06 +0800
Message-Id: <20230918041106.2134250-11-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Now ublk_abort_queue() runs exclusively with ublk_queue_rq() and the
ubq_daemon task, so simplify aborting requests:

- set UBLK_IO_FLAG_ABORTED in ublk_abort_queue()

- abort the request in ublk_queue_rq() if UBLK_IO_FLAG_ABORTED is
  observed; there is no need to check the ubq_daemon any more

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 46 ++++++++++++----------------------------
 1 file changed, 14 insertions(+), 32 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 90e0137ff784..0f54f63136fd 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1077,13 +1077,10 @@ static void __ublk_fail_req(struct ublk_queue *ubq, struct ublk_io *io,
 {
 	WARN_ON_ONCE(io->flags & UBLK_IO_FLAG_ACTIVE);
 
-	if (!(io->flags & UBLK_IO_FLAG_ABORTED)) {
-		io->flags |= UBLK_IO_FLAG_ABORTED;
-		if (ublk_queue_can_use_recovery_reissue(ubq))
-			blk_mq_requeue_request(req, false);
-		else
-			ublk_put_req_ref(ubq, req);
-	}
+	if (ublk_queue_can_use_recovery_reissue(ubq))
+		blk_mq_requeue_request(req, false);
+	else
+		ublk_put_req_ref(ubq, req);
 }
 
 static void ubq_complete_io_cmd(struct ublk_io *io, int res,
@@ -1221,30 +1218,12 @@ static void ublk_rq_task_work_cb(struct io_uring_cmd *cmd, unsigned issue_flags)
 	ublk_forward_io_cmds(ubq, issue_flags);
 }
 
-static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
+static inline void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
 {
 	struct ublk_rq_data *data = blk_mq_rq_to_pdu(rq);
-	struct ublk_io *io;
-
-	if (!llist_add(&data->node, &ubq->io_cmds))
-		return;
 
-	io = &ubq->ios[rq->tag];
-	/*
-	 * If the check pass, we know that this is a re-issued request aborted
-	 * previously in exit notifier because the ubq_daemon(cmd's task) is
-	 * PF_EXITING. We cannot call io_uring_cmd_complete_in_task() anymore
-	 * because this ioucmd's io_uring context may be freed now if no inflight
-	 * ioucmd exists. Otherwise we may cause null-deref in ctx->fallback_work.
-	 *
-	 * Note: exit notifier sets UBLK_IO_FLAG_ABORTED and ends this request(releasing
-	 * the tag). Then the request is re-started(allocating the tag) and we are here.
-	 * Since releasing/allocating a tag implies smp_mb(), finding UBLK_IO_FLAG_ABORTED
-	 * guarantees that here is a re-issued request aborted previously.
-	 */
-	if (unlikely(io->flags & UBLK_IO_FLAG_ABORTED)) {
-		ublk_abort_io_cmds(ubq);
-	} else {
+	if (llist_add(&data->node, &ubq->io_cmds)) {
+		struct ublk_io *io = &ubq->ios[rq->tag];
 		struct io_uring_cmd *cmd = io->cmd;
 		struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
@@ -1274,6 +1253,7 @@ static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
 {
 	struct ublk_queue *ubq = hctx->driver_data;
 	struct request *rq = bd->rq;
+	struct ublk_io *io = &ubq->ios[rq->tag];
 	blk_status_t res;
 
 	/* fill iod to slot in io cmd buffer */
@@ -1293,13 +1273,12 @@ static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (ublk_queue_can_use_recovery(ubq) && unlikely(ubq->force_abort))
 		return BLK_STS_IOERR;
 
-	blk_mq_start_request(bd->rq);
-
-	if (unlikely(ubq_daemon_is_dying(ubq))) {
+	if (unlikely(io->flags & UBLK_IO_FLAG_ABORTED)) {
 		__ublk_abort_rq(ubq, rq);
 		return BLK_STS_OK;
 	}
 
+	blk_mq_start_request(bd->rq);
 	ublk_queue_cmd(ubq, rq);
 
 	return BLK_STS_OK;
@@ -1429,6 +1408,9 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];
 
+		if (!(io->flags & UBLK_IO_FLAG_ABORTED))
+			io->flags |= UBLK_IO_FLAG_ABORTED;
+
 		if (!(io->flags & UBLK_IO_FLAG_ACTIVE)) {
 			struct request *rq;
 
@@ -1437,7 +1419,7 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 			 * will do it
 			 */
 			rq = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], i);
-			if (rq)
+			if (rq && blk_mq_request_started(rq))
 				__ublk_fail_req(ubq, io, rq);
 		}
 	}
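To summarize the resulting contract, a condensed sketch (not the literal
driver code; my_io stands in for ublk_io) of how the two now-exclusive
paths cooperate through the flag:

struct my_io {
	unsigned int flags;
};
#define MY_IO_ABORTED	(1U << 0)

/* notifier path: runs only while the blk-mq queue is quiesced */
static void my_mark_aborted(struct my_io *io)
{
	io->flags |= MY_IO_ABORTED;	/* ->queue_rq() cannot race here */
}

/* ->queue_rq() path: runs only while the queue is not quiesced */
static bool my_should_abort(const struct my_io *io)
{
	return io->flags & MY_IO_ABORTED;
}

Because quiesce makes the two paths mutually exclusive, a plain flag check
replaces the earlier tag-recycling memory-barrier argument.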