From patchwork Mon Sep 18 04:10:57 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388769
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 01/10] io_uring: allocate ctx id and build map between id and ctx
Date: Mon, 18 Sep 2023 12:10:57 +0800
Message-Id: <20230918041106.2134250-2-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Prepare for notifying the uring_cmd driver when a ctx/io_uring_task is
going away. The driver registers a notifier callback to get notified, so
that it can cancel in-flight commands which may depend on the io task.

So that the driver can check whether a ctx matches a given uring_cmd,
allocate a ctx id and provide it to the callback; this avoids exposing
the whole ctx instance. A global xarray, ctx_ids, is added for holding
the mapping and for allocating a unique id for each ctx.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/io_uring.h       | 2 ++
 include/linux/io_uring_types.h | 3 +++
 io_uring/io_uring.c            | 9 +++++++++
 3 files changed, 14 insertions(+)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 106cdc55ff3b..ec9714e36477 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -41,6 +41,8 @@ static inline const void *io_uring_sqe_cmd(const struct io_uring_sqe *sqe)
 	return sqe->cmd;
 }
 
+#define IO_URING_INVALID_CTX_ID		UINT_MAX
+
 #if defined(CONFIG_IO_URING)
 int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
 			      struct iov_iter *iter, void *ioucmd);
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 13d19b9be9f4..d310bb073101 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -215,6 +215,9 @@ struct io_ring_ctx {
 		struct percpu_ref	refs;
 
 		enum task_work_notify_mode	notify_method;
+
+		/* for uring cmd driver to retrieve context */
+		unsigned int		id;
 	} ____cacheline_aligned_in_smp;
 
 	/* submission data */
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 783ed0fff71b..c015c070ff85 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -175,6 +175,9 @@ static struct ctl_table kernel_io_uring_disabled_table[] = {
 };
 #endif
 
+/* mapping between io_ring_ctx instance and its ctx_id */
+static DEFINE_XARRAY_FLAGS(ctx_ids, XA_FLAGS_ALLOC);
+
 struct sock *io_uring_get_socket(struct file *file)
 {
 #if defined(CONFIG_UNIX)
@@ -303,6 +306,10 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 
 	xa_init(&ctx->io_bl_xa);
 
+	ctx->id = IO_URING_INVALID_CTX_ID;
+	if (xa_alloc(&ctx_ids, &ctx->id, ctx, xa_limit_31b, GFP_KERNEL))
+		goto err;
+
 	/*
 	 * Use 5 bits less than the max cq entries, that should give us around
 	 * 32 entries per hash list if totally full and uniformly spread, but
@@ -356,6 +363,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	kfree(ctx->cancel_table_locked.hbs);
 	kfree(ctx->io_bl);
 	xa_destroy(&ctx->io_bl_xa);
+	xa_erase(&ctx_ids, ctx->id);
 	kfree(ctx);
 	return NULL;
 }
@@ -2929,6 +2937,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 	kfree(ctx->cancel_table_locked.hbs);
 	kfree(ctx->io_bl);
 	xa_destroy(&ctx->io_bl_xa);
+	xa_erase(&ctx_ids, ctx->id);
 	kfree(ctx);
 }
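[Editor's note: for readers unfamiliar with the allocating-xarray API used
above, the ctx_ids pattern can be exercised in isolation. A minimal,
self-contained sketch; the demo_* names are illustrative only, not part of
this series:

#include <linux/xarray.h>
#include <linux/slab.h>

static DEFINE_XARRAY_FLAGS(demo_ids, XA_FLAGS_ALLOC);

struct demo {
	unsigned int id;
};

static struct demo *demo_alloc(void)
{
	struct demo *d = kzalloc(sizeof(*d), GFP_KERNEL);

	if (!d)
		return NULL;
	/* allocate a unique id in [0, INT_MAX] and map it to 'd' */
	if (xa_alloc(&demo_ids, &d->id, d, xa_limit_31b, GFP_KERNEL)) {
		kfree(d);
		return NULL;
	}
	return d;
}

static void demo_free(struct demo *d)
{
	xa_erase(&demo_ids, d->id);	/* drop the id -> object mapping */
	kfree(d);
}

xa_limit_31b keeps ids below UINT_MAX, which is why IO_URING_INVALID_CTX_ID
can safely be UINT_MAX in the patch above.]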
From patchwork Mon Sep 18 04:10:58 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388771
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 02/10] io_uring: pass io_uring_ctx->id to uring_cmd
Date: Mon, 18 Sep 2023 12:10:58 +0800
Message-Id: <20230918041106.2134250-3-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Pass the io_ring_ctx id to the uring_cmd driver, preparing for support
of an io_uring ctx/task exit notifier.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/io_uring.h | 6 +++++-
 io_uring/uring_cmd.c     | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index ec9714e36477..c395807bd7cf 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -33,7 +33,11 @@ struct io_uring_cmd {
 	};
 	u32		cmd_op;
 	u32		flags;
-	u8		pdu[32]; /* available inline for free use */
+	union {
+		/* driver needs to save ctx_id */
+		u32	ctx_id;
+		u8	pdu[32]; /* available inline for free use */
+	};
 };
 
 static inline const void *io_uring_sqe_cmd(const struct io_uring_sqe *sqe)
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 537795fddc87..c54c627fb6b9 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -105,6 +105,7 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		req->imu = ctx->user_bufs[index];
 		io_req_set_rsrc_node(req, ctx, 0);
 	}
+	ioucmd->ctx_id = req->ctx->id;
 	ioucmd->sqe = sqe;
 	ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
 	return 0;
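[Editor's note: a sketch of how a driver might consume the new ctx_id
field, along the lines of the per-queue binding the later ublk patches
apply; struct demo_queue and demo_uring_cmd are hypothetical names:

#include <linux/io_uring.h>

/* per-queue state in a hypothetical uring_cmd driver */
struct demo_queue {
	u32 ctx_id;	/* io_uring context serving this queue */
};

static int demo_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags,
			  struct demo_queue *q)
{
	/* the first command binds the queue to the submitting context */
	if (q->ctx_id == IO_URING_INVALID_CTX_ID)
		q->ctx_id = cmd->ctx_id;
	/* later commands must come from the same io_uring context */
	else if (cmd->ctx_id != q->ctx_id)
		return -EINVAL;

	return -EIOCBQUEUED;	/* command stays in flight */
}

Because ctx_id is a union member overlapping pdu[32], a driver that uses
this field gives up the first four bytes of the inline pdu area.]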
From patchwork Mon Sep 18 04:10:59 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388773
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 03/10] io_uring: support io_uring notifier for uring_cmd
Date: Mon, 18 Sep 2023 12:10:59 +0800
Message-Id: <20230918041106.2134250-4-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

A notifier callback is registered by the driver to get notified; so far
this is only for uring_cmd based drivers. With this notifier, the driver
can cancel in-flight commands when the ctx is being released or the io
task is exiting.

The main use case is ublk (or probably fuse with uring_cmd support),
where a uring command may never complete, so the driver has to cancel
the command when the io task exits or the ctx is released; otherwise
__io_uring_cancel() may wait forever on the inflight commands.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/io_uring.h | 19 ++++++++++++++++
 io_uring/io_uring.c      | 48 ++++++++++++++++++++++++++++++++++++++++
 io_uring/io_uring.h      |  4 ++++
 io_uring/uring_cmd.c     | 12 ++++++++++
 4 files changed, 83 insertions(+)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index c395807bd7cf..037bff9960a1 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -47,7 +47,19 @@ static inline const void *io_uring_sqe_cmd(const struct io_uring_sqe *sqe)
 
 #define IO_URING_INVALID_CTX_ID		UINT_MAX
 
+enum io_uring_notifier {
+	IO_URING_NOTIFIER_CTX_EXIT,
+	IO_URING_NOTIFIER_IO_TASK_EXIT,
+};
+
+struct io_uring_notifier_data {
+	unsigned int ctx_id;
+	const struct task_struct *task;
+};
+
 #if defined(CONFIG_IO_URING)
+int io_uring_cmd_register_notifier(struct notifier_block *nb);
+void io_uring_cmd_unregister_notifier(struct notifier_block *nb);
 int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
 			      struct iov_iter *iter, void *ioucmd);
 void io_uring_cmd_done(struct io_uring_cmd *cmd, ssize_t ret, ssize_t res2,
@@ -89,6 +101,13 @@ static inline void io_uring_free(struct task_struct *tsk)
 }
 int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags);
 #else
+static inline int io_uring_cmd_register_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+static inline void io_uring_cmd_unregister_notifier(struct notifier_block *nb)
+{
+}
 static inline int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len,
 			      int rw, struct iov_iter *iter, void *ioucmd)
 {
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index c015c070ff85..de9b217bf5d8 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -73,6 +73,7 @@
 #include
 #include
 #include
+#include <linux/notifier.h>
 
 #define CREATE_TRACE_POINTS
 #include
@@ -178,6 +179,22 @@ static struct ctl_table kernel_io_uring_disabled_table[] = {
 /* mapping between io_ring_ctx instance and its ctx_id */
 static DEFINE_XARRAY_FLAGS(ctx_ids, XA_FLAGS_ALLOC);
 
+/*
+ * Uring_cmd driver can register to be notified when ctx/io_uring_task
+ * is going away for canceling inflight commands.
+ */
+static struct srcu_notifier_head notifier_chain;
+
+int io_uring_register_notifier(struct notifier_block *nb)
+{
+	return srcu_notifier_chain_register(&notifier_chain, nb);
+}
+
+void io_uring_unregister_notifier(struct notifier_block *nb)
+{
+	srcu_notifier_chain_unregister(&notifier_chain, nb);
+}
+
 struct sock *io_uring_get_socket(struct file *file)
 {
 #if defined(CONFIG_UNIX)
@@ -191,6 +208,11 @@ struct sock *io_uring_get_socket(struct file *file)
 }
 EXPORT_SYMBOL(io_uring_get_socket);
 
+struct io_ring_ctx *io_uring_id_to_ctx(unsigned int id)
+{
+	return (struct io_ring_ctx *)xa_load(&ctx_ids, id);
+}
+
 static inline void io_submit_flush_completions(struct io_ring_ctx *ctx)
 {
 	if (!wq_list_empty(&ctx->submit_state.compl_reqs) ||
@@ -3060,6 +3082,23 @@ static __cold bool io_cancel_ctx_cb(struct io_wq_work *work, void *data)
 	return req->ctx == data;
 }
 
+static __cold void io_uring_cancel_notify(struct io_ring_ctx *ctx,
+					  struct task_struct *task)
+{
+	struct io_uring_notifier_data notifier_data = {
+		.ctx_id = ctx->id,
+		.task = task,
+	};
+	enum io_uring_notifier notifier;
+
+	if (!task)
+		notifier = IO_URING_NOTIFIER_CTX_EXIT;
+	else
+		notifier = IO_URING_NOTIFIER_IO_TASK_EXIT;
+
+	srcu_notifier_call_chain(&notifier_chain, notifier, &notifier_data);
+}
+
 static __cold void io_ring_exit_work(struct work_struct *work)
 {
 	struct io_ring_ctx *ctx = container_of(work, struct io_ring_ctx, exit_work);
@@ -3069,6 +3108,8 @@ static __cold void io_ring_exit_work(struct work_struct *work)
 	struct io_tctx_node *node;
 	int ret;
 
+	io_uring_cancel_notify(ctx, NULL);
+
 	/*
 	 * If we're doing polled IO and end up having requests being
 	 * submitted async (out-of-line), then completions can come in while
@@ -3346,6 +3387,11 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
 	if (tctx->io_wq)
 		io_wq_exit_start(tctx->io_wq);
 
+	if (!cancel_all) {
+		xa_for_each(&tctx->xa, index, node)
+			io_uring_cancel_notify(node->ctx, current);
+	}
+
 	atomic_inc(&tctx->in_cancel);
 	do {
 		bool loop = false;
@@ -4695,6 +4741,8 @@ static int __init io_uring_init(void)
 	register_sysctl_init("kernel", kernel_io_uring_disabled_table);
 #endif
 
+	srcu_init_notifier_head(&notifier_chain);
+
 	return 0;
 };
 __initcall(io_uring_init);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 547c30582fb8..1d5588d8a88a 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -38,6 +38,10 @@ enum {
 	IOU_STOP_MULTISHOT = -ECANCELED,
 };
 
+struct io_ring_ctx *io_uring_id_to_ctx(unsigned int id);
+int io_uring_register_notifier(struct notifier_block *nb);
+void io_uring_unregister_notifier(struct notifier_block *nb);
+
 bool io_cqe_cache_refill(struct io_ring_ctx *ctx, bool overflow);
 void io_req_cqe_overflow(struct io_kiocb *req);
 int io_run_task_work_sig(struct io_ring_ctx *ctx);
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index c54c627fb6b9..03e3a8c1b712 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -192,3 +192,15 @@ int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags)
 	}
 }
 EXPORT_SYMBOL_GPL(io_uring_cmd_sock);
+
+int io_uring_cmd_register_notifier(struct notifier_block *nb)
+{
+	return io_uring_register_notifier(nb);
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_register_notifier);
+
+void io_uring_cmd_unregister_notifier(struct notifier_block *nb)
+{
+	io_uring_unregister_notifier(nb);
+}
+EXPORT_SYMBOL_GPL(io_uring_cmd_unregister_notifier);
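[Editor's note: a minimal sketch of the driver-side contract introduced
here, assuming only the API added above; the demo_* names are hypothetical:

#include <linux/io_uring.h>
#include <linux/notifier.h>

static int demo_exit_notify(struct notifier_block *nb, unsigned long event,
			    void *val)
{
	struct io_uring_notifier_data *data = val;

	switch (event) {
	case IO_URING_NOTIFIER_CTX_EXIT:
		/* the ring identified by data->ctx_id is going away:
		 * cancel every in-flight command issued from it */
		break;
	case IO_URING_NOTIFIER_IO_TASK_EXIT:
		/* only data->task is exiting: cancel the commands that
		 * were submitted by that task on data->ctx_id */
		break;
	}
	return 0;
}

static struct notifier_block demo_nb = {
	.notifier_call = demo_exit_notify,
};

static int demo_open(void)
{
	return io_uring_cmd_register_notifier(&demo_nb);
}

static void demo_release(void)
{
	io_uring_cmd_unregister_notifier(&demo_nb);
}

The SRCU chain means unregistration waits for callbacks in flight, so the
driver can free per-device state right after demo_release() returns.]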
From patchwork Mon Sep 18 04:11:00 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388772
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 04/10] ublk: don't get ublk device reference in ublk_abort_queue()
Date: Mon, 18 Sep 2023 12:11:00 +0800
Message-Id: <20230918041106.2134250-5-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

ublk_abort_queue() is only called from ublk_daemon_monitor_work(), where
the ublk device is guaranteed to be live, so there is no need to grab a
device reference.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 630ddfe6657b..9b3c0b3dd36e 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1419,9 +1419,6 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 {
 	int i;
 
-	if (!ublk_get_device(ub))
-		return;
-
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];
 
@@ -1437,7 +1434,6 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 			__ublk_fail_req(ubq, io, rq);
 		}
 	}
-	ublk_put_device(ub);
 }
 
 static void ublk_daemon_monitor_work(struct work_struct *work)

From patchwork Mon Sep 18 04:11:01 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388774
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 05/10] ublk: make sure ublk uring cmd handling is done in submitter task context
Date: Mon, 18 Sep 2023 12:11:01 +0800
Message-Id: <20230918041106.2134250-6-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>
In a well-implemented ublk server, a ublk io command is never linked into
an SQE chain, and commands are always handled in no-wait style, so ublk
uring_cmd handling is normally done in the submitter's task context.
However, the server may set IOSQE_ASYNC, or an io command may be linked
into a chain by mistake, in which case handling can still run in io-wq
context without ctx->uring_lock held.

So in case of IO_URING_F_UNLOCKED, reschedule the command via
io_uring_cmd_complete_in_task() so that it runs in the submitter task.
Then ublk_ch_uring_cmd_local() is always called with the ctx uring_lock
held, and we need not worry about synchronization in the submission code
path.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 9b3c0b3dd36e..46d499d96ca3 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1810,7 +1810,8 @@ static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
 	return NULL;
 }
 
-static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
+static inline int ublk_ch_uring_cmd_local(struct io_uring_cmd *cmd,
+		unsigned int issue_flags)
 {
 	/*
 	 * Not necessary for async retry, but let's keep it simple and always
@@ -1824,9 +1825,28 @@ static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
 		.addr = READ_ONCE(ub_src->addr)
 	};
 
+	WARN_ON_ONCE(issue_flags & IO_URING_F_UNLOCKED);
+
 	return __ublk_ch_uring_cmd(cmd, issue_flags, &ub_cmd);
 }
 
+static void ublk_ch_uring_cmd_cb(struct io_uring_cmd *cmd,
+		unsigned int issue_flags)
+{
+	ublk_ch_uring_cmd_local(cmd, issue_flags);
+}
+
+static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
+{
+	/* well-implemented server won't run into unlocked */
+	if (unlikely(issue_flags & IO_URING_F_UNLOCKED)) {
+		io_uring_cmd_complete_in_task(cmd, ublk_ch_uring_cmd_cb);
+		return -EIOCBQUEUED;
+	}
+
+	return ublk_ch_uring_cmd_local(cmd, issue_flags);
+}
+
 static inline bool ublk_check_ubuf_dir(const struct request *req,
 		int ubuf_dir)
 {
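[Editor's note: the same dispatch pattern, reduced to its skeleton for a
hypothetical uring_cmd driver; demo_* names are illustrative:

#include <linux/io_uring.h>

/* runs in the submitter's task context, with the ctx uring_lock held */
static void demo_cmd_task_cb(struct io_uring_cmd *cmd,
			     unsigned int issue_flags)
{
	/* actual command handling goes here */
}

static int demo_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
{
	/* io-wq retry (IOSQE_ASYNC or a linked SQE): bounce back to task */
	if (unlikely(issue_flags & IO_URING_F_UNLOCKED)) {
		io_uring_cmd_complete_in_task(cmd, demo_cmd_task_cb);
		return -EIOCBQUEUED;
	}
	/* common case: already in submitter context */
	demo_cmd_task_cb(cmd, issue_flags);
	return -EIOCBQUEUED;
}

Returning -EIOCBQUEUED tells io_uring the command stays in flight; the
driver completes it later with io_uring_cmd_done().]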
From patchwork Mon Sep 18 04:11:02 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388779
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 06/10] ublk: make sure that uring cmd aiming at same queue won't cross io_uring contexts
Date: Mon, 18 Sep 2023 12:11:02 +0800
Message-Id: <20230918041106.2134250-7-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Make sure that all commands aimed at the same ublk queue come from the
same io_uring context. This is a reasonable requirement: there is no
obvious reason for userspace to send uring cmds to the same queue from
multiple io_uring contexts.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 46d499d96ca3..52dd53662ffb 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -131,6 +131,7 @@ struct ublk_queue {
 	unsigned long flags;
 	struct task_struct	*ubq_daemon;
 	char *io_cmd_buf;
+	unsigned int ctx_id;
 
 	struct llist_head	io_cmds;
 
@@ -1410,6 +1411,11 @@ static void ublk_commit_completion(struct ublk_device *ub,
 	ublk_put_req_ref(ubq, req);
 }
 
+static inline bool ublk_ctx_id_is_valid(unsigned int ctx_id)
+{
+	return ctx_id != IO_URING_INVALID_CTX_ID;
+}
+
 /*
  * When ->ubq_daemon is exiting, either new request is ended immediately,
  * or any queued io command is drained, so it is safe to abort queue
@@ -1609,11 +1615,13 @@ static void ublk_stop_dev(struct ublk_device *ub)
 }
 
 /* device can only be started after all IOs are ready */
-static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq)
+static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq,
+		unsigned int ctx_id)
 {
 	mutex_lock(&ub->mutex);
 	ubq->nr_io_ready++;
 	if (ublk_queue_ready(ubq)) {
+		ubq->ctx_id = ctx_id;
 		ubq->ubq_daemon = current;
 		get_task_struct(ubq->ubq_daemon);
 		ub->nr_queues_ready++;
@@ -1682,6 +1690,9 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
 	if (ubq->ubq_daemon && ubq->ubq_daemon != current)
 		goto out;
 
+	if (ublk_ctx_id_is_valid(ubq->ctx_id) && cmd->ctx_id != ubq->ctx_id)
+		goto out;
+
 	if (tag >= ubq->q_depth)
 		goto out;
 
@@ -1734,7 +1745,7 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
 		}
 
 		ublk_fill_io_cmd(io, cmd, ub_cmd->addr);
-		ublk_mark_io_ready(ub, ubq);
+		ublk_mark_io_ready(ub, ubq, cmd->ctx_id);
 		break;
 	case UBLK_IO_COMMIT_AND_FETCH_REQ:
		req = blk_mq_tag_to_rq(ub->tag_set.tags[ub_cmd->q_id], tag);
@@ -1989,6 +2000,7 @@ static int ublk_init_queue(struct ublk_device *ub, int q_id)
 
 	ubq->io_cmd_buf = ptr;
 	ubq->dev = ub;
+	ubq->ctx_id = IO_URING_INVALID_CTX_ID;
 	return 0;
 }
 
@@ -2593,6 +2605,8 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
 	ubq->ubq_daemon = NULL;
 	ubq->timeout = false;
 
+	ubq->ctx_id = IO_URING_INVALID_CTX_ID;
+
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];

From patchwork Mon Sep 18 04:11:03 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388778
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 07/10] ublk: rename mm_lock as lock
Date: Mon, 18 Sep 2023 12:11:03 +0800
Message-Id: <20230918041106.2134250-8-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Rename the mm_lock field of ublk_device to lock, so that this lock can
be reused to protect access to ub->ub_disk, which the next patch uses to
simplify ublk_abort_queue() by quiescing the queue.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 52dd53662ffb..4bc4c4f87b36 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -167,7 +167,7 @@ struct ublk_device {
 
 	struct mutex		mutex;
 
-	spinlock_t		mm_lock;
+	spinlock_t		lock;
 	struct mm_struct	*mm;
 
 	struct ublk_params	params;
@@ -1358,12 +1358,12 @@ static int ublk_ch_mmap(struct file *filp, struct vm_area_struct *vma)
 	unsigned long pfn, end, phys_off = vma->vm_pgoff << PAGE_SHIFT;
 	int q_id, ret = 0;
 
-	spin_lock(&ub->mm_lock);
+	spin_lock(&ub->lock);
 	if (!ub->mm)
 		ub->mm = current->mm;
 	if (current->mm != ub->mm)
 		ret = -EINVAL;
-	spin_unlock(&ub->mm_lock);
+	spin_unlock(&ub->lock);
 
 	if (ret)
 		return ret;
@@ -2348,7 +2348,7 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
 	if (!ub)
 		goto out_unlock;
 	mutex_init(&ub->mutex);
-	spin_lock_init(&ub->mm_lock);
+	spin_lock_init(&ub->lock);
 	INIT_WORK(&ub->quiesce_work, ublk_quiesce_work_fn);
 	INIT_WORK(&ub->stop_work, ublk_stop_work_fn);
 	INIT_DELAYED_WORK(&ub->monitor_work, ublk_daemon_monitor_work);
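[Editor's note: for context, the pattern this rename enables (and which
the next patch applies to ub->ub_disk) is roughly the following sketch;
struct demo_dev is a hypothetical stand-in for ublk_device:

#include <linux/blkdev.h>
#include <linux/spinlock.h>

struct demo_dev {
	spinlock_t lock;	/* protects ->disk */
	struct gendisk *disk;
};

/* grab a reference to the disk under the lock, then use it unlocked */
static struct gendisk *demo_get_disk(struct demo_dev *d)
{
	struct gendisk *disk;

	spin_lock(&d->lock);
	disk = d->disk;
	if (disk)
		get_device(disk_to_dev(disk));
	spin_unlock(&d->lock);

	return disk;	/* NULL if the disk is already gone */
}

The teardown side clears ->disk under the same spinlock before dropping
its own reference, so a non-NULL return here is always safe to use.]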
From patchwork Mon Sep 18 04:11:04 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388775
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 08/10] ublk: quiesce request queue when aborting queue
Date: Mon, 18 Sep 2023 12:11:04 +0800
Message-Id: <20230918041106.2134250-9-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

So far, aborting the queue ends requests when the ubq daemon is exiting,
and this can run concurrently with ublk_queue_rq(). That approach is
fragile, and we depend on tricky use of UBLK_IO_FLAG_ABORTED to avoid
the race.

Quiesce the queue when aborting it, so the two code paths run completely
exclusively. That makes it easier to add new ublk features, such as
relaxing the single-task limit for each queue.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 43 ++++++++++++++++++++++++++++++++++------
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 4bc4c4f87b36..3b691bf3d9ef 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1446,21 +1446,45 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
 {
 	struct ublk_device *ub =
 		container_of(work, struct ublk_device, monitor_work.work);
+	struct gendisk *disk;
 	int i;
 
 	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
 		struct ublk_queue *ubq = ublk_get_queue(ub, i);
 
-		if (ubq_daemon_is_dying(ubq)) {
-			if (ublk_queue_can_use_recovery(ubq))
-				schedule_work(&ub->quiesce_work);
-			else
-				schedule_work(&ub->stop_work);
+		if (ubq_daemon_is_dying(ubq))
+			goto found;
+	}
+	return;
+
+found:
+	spin_lock(&ub->lock);
+	disk = ub->ub_disk;
+	if (disk)
+		get_device(disk_to_dev(disk));
+	spin_unlock(&ub->lock);
+
+	/* Our disk has been dead */
+	if (!disk)
+		return;
 
+	/* Now we are serialized with ublk_queue_rq() */
+	blk_mq_quiesce_queue(disk->queue);
+	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
+		struct ublk_queue *ubq = ublk_get_queue(ub, i);
+
+		if (ubq_daemon_is_dying(ubq)) {
 			/* abort queue is for making forward progress */
 			ublk_abort_queue(ub, ubq);
 		}
 	}
+	blk_mq_unquiesce_queue(disk->queue);
+	put_device(disk_to_dev(disk));
+
+	if (ublk_can_use_recovery(ub))
+		schedule_work(&ub->quiesce_work);
+	else
+		schedule_work(&ub->stop_work);
 
 	/*
 	 * We can't schedule monitor work after ub's state is not UBLK_S_DEV_LIVE.
@@ -1595,6 +1619,8 @@ static void ublk_unquiesce_dev(struct ublk_device *ub)
 
 static void ublk_stop_dev(struct ublk_device *ub)
 {
+	struct gendisk *disk;
+
 	mutex_lock(&ub->mutex);
 	if (ub->dev_info.state == UBLK_S_DEV_DEAD)
 		goto unlock;
@@ -1604,10 +1630,15 @@ static void ublk_stop_dev(struct ublk_device *ub)
 		ublk_unquiesce_dev(ub);
 	}
 	del_gendisk(ub->ub_disk);
+
+	/* Sync with ublk_abort_queue() by holding the lock */
+	spin_lock(&ub->lock);
+	disk = ub->ub_disk;
 	ub->dev_info.state = UBLK_S_DEV_DEAD;
 	ub->dev_info.ublksrv_pid = -1;
-	put_disk(ub->ub_disk);
 	ub->ub_disk = NULL;
+	spin_unlock(&ub->lock);
+	put_disk(disk);
 unlock:
 	ublk_cancel_dev(ub);
 	mutex_unlock(&ub->mutex);
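[Editor's note: a sketch of the serialization this patch relies on.
->queue_rq() runs inside blk-mq's RCU/SRCU protection, so once
blk_mq_quiesce_queue() returns, no ->queue_rq() invocation is in flight
and none will start until the queue is unquiesced; demo_abort_all is a
hypothetical name:

#include <linux/blk-mq.h>

static void demo_abort_all(struct request_queue *q)
{
	blk_mq_quiesce_queue(q);	/* waits for in-flight ->queue_rq() */

	/* safe to walk tags and fail requests here: this section is
	 * fully exclusive with the submission path */

	blk_mq_unquiesce_queue(q);
}

This is why the abort loop above no longer needs to race-proof itself
against concurrent ublk_queue_rq() calls.]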
From patchwork Mon Sep 18 04:11:05 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388777
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 09/10] ublk: replace monitor work with uring_cmd exit notifier
Date: Mon, 18 Sep 2023 12:11:05 +0800
Message-Id: <20230918041106.2134250-10-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

The monitor work introduces one extra context for handling aborts; this
is easy to race and adds extra delay when handling aborting. Now that
uring_cmd provides an exit notifier, use it instead:

1) The notifier callback runs either from the uring_cmd submission task
context or after the io_uring context has exited, so the callback runs
exclusively with ublk_ch_uring_cmd() and __ublk_rq_task_work().

2) The previous patch quiesces the request queue when calling
ublk_abort_queue(), which is now completely exclusive with
ublk_queue_rq() and ublk_ch_uring_cmd()/__ublk_rq_task_work().

This simplifies aborting the queue, and is helpful for adding new
features, such as relaxing the single-task limit for handling one queue.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 89 +++++++++++++++++++---------------------
 1 file changed, 42 insertions(+), 47 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 3b691bf3d9ef..90e0137ff784 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -144,8 +144,6 @@ struct ublk_queue {
 	struct ublk_io ios[];
 };
 
-#define UBLK_DAEMON_MONITOR_PERIOD	(5 * HZ)
-
 struct ublk_device {
 	struct gendisk		*ub_disk;
 
@@ -176,11 +174,7 @@ struct ublk_device {
 	unsigned int		nr_queues_ready;
 	unsigned int		nr_privileged_daemon;
 
-	/*
-	 * Our ubq->daemon may be killed without any notification, so
-	 * monitor each queue's daemon periodically
-	 */
-	struct delayed_work	monitor_work;
+	struct notifier_block	notif;
 	struct work_struct	quiesce_work;
 	struct work_struct	stop_work;
 };
@@ -194,7 +188,6 @@ struct ublk_params_header {
 static inline unsigned int ublk_req_build_flags(struct request *req);
 static inline struct ublksrv_io_desc *ublk_get_iod(struct ublk_queue *ubq,
 						   int tag);
-
 static inline bool ublk_dev_is_user_copy(const struct ublk_device *ub)
 {
 	return ub->dev_info.flags & UBLK_F_USER_COPY;
@@ -1119,8 +1112,6 @@ static inline void __ublk_abort_rq(struct ublk_queue *ubq,
 		blk_mq_requeue_request(rq, false);
 	else
 		blk_mq_end_request(rq, BLK_STS_IOERR);
-
-	mod_delayed_work(system_wq, &ubq->dev->monitor_work, 0);
 }
 
 static inline void __ublk_rq_task_work(struct request *req,
@@ -1241,12 +1232,12 @@ static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
 	io = &ubq->ios[rq->tag];
 	/*
 	 * If the check pass, we know that this is a re-issued request aborted
-	 * previously in monitor_work because the ubq_daemon(cmd's task) is
+	 * previously in exit notifier because the ubq_daemon(cmd's task) is
 	 * PF_EXITING. We cannot call io_uring_cmd_complete_in_task() anymore
 	 * because this ioucmd's io_uring context may be freed now if no inflight
 	 * ioucmd exists. Otherwise we may cause null-deref in ctx->fallback_work.
 	 *
-	 * Note: monitor_work sets UBLK_IO_FLAG_ABORTED and ends this request(releasing
+	 * Note: exit notifier sets UBLK_IO_FLAG_ABORTED and ends this request(releasing
 	 * the tag). Then the request is re-started(allocating the tag) and we are here.
 	 * Since releasing/allocating a tag implies smp_mb(), finding UBLK_IO_FLAG_ABORTED
 	 * guarantees that here is a re-issued request aborted previously.
@@ -1334,9 +1325,17 @@ static int ublk_ch_open(struct inode *inode, struct file *filp)
 {
 	struct ublk_device *ub = container_of(inode->i_cdev,
 			struct ublk_device, cdev);
+	int ret;
 
 	if (test_and_set_bit(UB_STATE_OPEN, &ub->state))
 		return -EBUSY;
+
+	ret = io_uring_cmd_register_notifier(&ub->notif);
+	if (ret) {
+		clear_bit(UB_STATE_OPEN, &ub->state);
+		return ret;
+	}
+
 	filp->private_data = ub;
 	return 0;
 }
@@ -1346,6 +1345,8 @@ static int ublk_ch_release(struct inode *inode, struct file *filp)
 	struct ublk_device *ub = filp->private_data;
 
 	clear_bit(UB_STATE_OPEN, &ub->state);
+	io_uring_cmd_unregister_notifier(&ub->notif);
+
 	return 0;
 }
 
@@ -1417,9 +1418,9 @@ static inline bool ublk_ctx_id_is_valid(unsigned int ctx_id)
 }
 
 /*
- * When ->ubq_daemon is exiting, either new request is ended immediately,
- * or any queued io command is drained, so it is safe to abort queue
- * lockless
+ * Called from ubq_daemon context via the notifier, meantime quiesce ublk
+ * blk-mq queue, so we are called exclusively with blk-mq and ubq_daemon
+ * context, so everything is serialized.
 */
 static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 {
@@ -1442,20 +1443,34 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 	}
 }
 
-static void ublk_daemon_monitor_work(struct work_struct *work)
+static inline bool ubq_daemon_is_exiting(const struct ublk_queue *ubq,
+		const struct io_uring_notifier_data *data)
 {
-	struct ublk_device *ub =
-		container_of(work, struct ublk_device, monitor_work.work);
+	return data->ctx_id == ubq->ctx_id && (!data->task ||
+			data->task == ubq->ubq_daemon);
+}
+
+static int ublk_notifier_cb(struct notifier_block *nb,
+		unsigned long event, void *val)
+{
+	struct ublk_device *ub = container_of(nb, struct ublk_device, notif);
+	struct io_uring_notifier_data *data = val;
 	struct gendisk *disk;
 	int i;
 
+	pr_devel("%s: event %lu ctx_id %u task %p\n", __func__, event,
+			data->ctx_id, data->task);
+
+	if (!ublk_ctx_id_is_valid(data->ctx_id))
+		return 0;
+
 	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
 		struct ublk_queue *ubq = ublk_get_queue(ub, i);
 
-		if (ubq_daemon_is_dying(ubq))
+		if (ubq_daemon_is_exiting(ubq, data))
 			goto found;
 	}
-	return;
+	return 0;
 
 found:
 	spin_lock(&ub->lock);
@@ -1466,14 +1481,14 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
 
 	/* Our disk has been dead */
 	if (!disk)
-		return;
+		return 0;
 
 	/* Now we are serialized with ublk_queue_rq() */
 	blk_mq_quiesce_queue(disk->queue);
 	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
 		struct ublk_queue *ubq = ublk_get_queue(ub, i);
 
-		if (ubq_daemon_is_dying(ubq)) {
+		if (ubq_daemon_is_exiting(ubq, data)) {
 			/* abort queue is for making forward progress */
 			ublk_abort_queue(ub, ubq);
 		}
@@ -1485,17 +1500,7 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
 		schedule_work(&ub->quiesce_work);
 	else
 		schedule_work(&ub->stop_work);
-
-	/*
-	 * We can't schedule monitor work after ub's state is not UBLK_S_DEV_LIVE.
-	 * after ublk_remove() or __ublk_quiesce_dev() is started.
-	 *
-	 * No need ub->mutex, monitor work are canceled after state is marked
-	 * as not LIVE, so new state is observed reliably.
-	 */
-	if (ub->dev_info.state == UBLK_S_DEV_LIVE)
-		schedule_delayed_work(&ub->monitor_work,
-				UBLK_DAEMON_MONITOR_PERIOD);
+	return 0;
 }
 
 static inline bool ublk_queue_ready(struct ublk_queue *ubq)
@@ -1512,6 +1517,9 @@ static void ublk_cancel_queue(struct ublk_queue *ubq)
 {
 	int i;
 
+	if (ublk_ctx_id_is_valid(ubq->ctx_id))
+		ubq->ctx_id = IO_URING_INVALID_CTX_ID;
+
 	if (!ublk_queue_ready(ubq))
 		return;
 
@@ -1572,15 +1580,6 @@ static void __ublk_quiesce_dev(struct ublk_device *ub)
 	ublk_wait_tagset_rqs_idle(ub);
 	ub->dev_info.state = UBLK_S_DEV_QUIESCED;
 	ublk_cancel_dev(ub);
-	/* we are going to release task_struct of ubq_daemon and resets
-	 * ->ubq_daemon to NULL. So in monitor_work, check on ubq_daemon causes UAF.
-	 * Besides, monitor_work is not necessary in QUIESCED state since we have
-	 * already scheduled quiesce_work and quiesced all ubqs.
-	 *
-	 * Do not let monitor_work schedule itself if state it QUIESCED. And we cancel
-	 * it here and re-schedule it in END_USER_RECOVERY to avoid UAF.
-	 */
-	cancel_delayed_work_sync(&ub->monitor_work);
 }
 
 static void ublk_quiesce_work_fn(struct work_struct *work)
@@ -1642,7 +1641,6 @@ static void ublk_stop_dev(struct ublk_device *ub)
 unlock:
 	ublk_cancel_dev(ub);
 	mutex_unlock(&ub->mutex);
-	cancel_delayed_work_sync(&ub->monitor_work);
 }
 
 /* device can only be started after all IOs are ready */
@@ -2210,8 +2208,6 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub, struct io_uring_cmd *cmd)
 	if (wait_for_completion_interruptible(&ub->completion) != 0)
 		return -EINTR;
 
-	schedule_delayed_work(&ub->monitor_work, UBLK_DAEMON_MONITOR_PERIOD);
-
 	mutex_lock(&ub->mutex);
 	if (ub->dev_info.state == UBLK_S_DEV_LIVE ||
 	    test_bit(UB_STATE_USED, &ub->state)) {
@@ -2382,7 +2378,7 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
 	spin_lock_init(&ub->lock);
 	INIT_WORK(&ub->quiesce_work, ublk_quiesce_work_fn);
 	INIT_WORK(&ub->stop_work, ublk_stop_work_fn);
-	INIT_DELAYED_WORK(&ub->monitor_work, ublk_daemon_monitor_work);
+	ub->notif.notifier_call = ublk_notifier_cb;
 
 	ret = ublk_alloc_dev_number(ub, header->dev_id);
 	if (ret < 0)
@@ -2722,7 +2718,6 @@ static int ublk_ctrl_end_recovery(struct ublk_device *ub,
 			__func__, header->dev_id);
 	blk_mq_kick_requeue_list(ub->ub_disk->queue);
 	ub->dev_info.state = UBLK_S_DEV_LIVE;
-	schedule_delayed_work(&ub->monitor_work, UBLK_DAEMON_MONITOR_PERIOD);
 	ret = 0;
 out_unlock:
 	mutex_unlock(&ub->mutex);
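[Editor's note: the matching rule encoded in ubq_daemon_is_exiting() above,
restated as a standalone sketch; struct demo_queue is a hypothetical
stand-in for ublk_queue:

#include <linux/io_uring.h>
#include <linux/sched.h>

struct demo_queue {
	u32 ctx_id;
	struct task_struct *daemon;
};

/*
 * A queue is affected by the notification if it is bound to the exiting
 * ring and, for a task-exit event, is served by the exiting task;
 * data->task is NULL for IO_URING_NOTIFIER_CTX_EXIT, which matches any
 * queue bound to the ring.
 */
static bool demo_queue_is_exiting(const struct demo_queue *q,
				  const struct io_uring_notifier_data *data)
{
	return data->ctx_id == q->ctx_id &&
		(!data->task || data->task == q->daemon);
}
]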
From patchwork Mon Sep 18 04:11:06 2023
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 13388776
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: ZiyangZhang, Ming Lei
Subject: [PATCH 10/10] ublk: simplify aborting request
Date: Mon, 18 Sep 2023 12:11:06 +0800
Message-Id: <20230918041106.2134250-11-ming.lei@redhat.com>
In-Reply-To: <20230918041106.2134250-1-ming.lei@redhat.com>
References: <20230918041106.2134250-1-ming.lei@redhat.com>

Now ublk_abort_queue() runs exclusively with ublk_queue_rq() and the
ubq_daemon task, so simplify aborting requests:

- set UBLK_IO_FLAG_ABORTED in ublk_abort_queue()

- abort the request in ublk_queue_rq() if UBLK_IO_FLAG_ABORTED is
  observed; there is no need to check ubq_daemon any more

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 46 ++++++++++++----------------------------
 1 file changed, 14 insertions(+), 32 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 90e0137ff784..0f54f63136fd 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1077,13 +1077,10 @@ static void __ublk_fail_req(struct ublk_queue *ubq, struct ublk_io *io,
 {
 	WARN_ON_ONCE(io->flags & UBLK_IO_FLAG_ACTIVE);
 
-	if (!(io->flags & UBLK_IO_FLAG_ABORTED)) {
-		io->flags |= UBLK_IO_FLAG_ABORTED;
-		if (ublk_queue_can_use_recovery_reissue(ubq))
-			blk_mq_requeue_request(req, false);
-		else
-			ublk_put_req_ref(ubq, req);
-	}
+	if (ublk_queue_can_use_recovery_reissue(ubq))
+		blk_mq_requeue_request(req, false);
+	else
+		ublk_put_req_ref(ubq, req);
 }
 
 static void ubq_complete_io_cmd(struct ublk_io *io, int res,
@@ -1221,30 +1218,12 @@ static void ublk_rq_task_work_cb(struct io_uring_cmd *cmd, unsigned issue_flags)
 	ublk_forward_io_cmds(ubq, issue_flags);
 }
 
-static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
+static inline void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
 {
 	struct ublk_rq_data *data = blk_mq_rq_to_pdu(rq);
-	struct ublk_io *io;
-
-	if (!llist_add(&data->node, &ubq->io_cmds))
-		return;
 
-	io = &ubq->ios[rq->tag];
-	/*
-	 * If the check pass, we know that this is a re-issued request aborted
-	 * previously in exit notifier because the ubq_daemon(cmd's task) is
-	 * PF_EXITING. We cannot call io_uring_cmd_complete_in_task() anymore
-	 * because this ioucmd's io_uring context may be freed now if no inflight
-	 * ioucmd exists. Otherwise we may cause null-deref in ctx->fallback_work.
-	 *
-	 * Note: exit notifier sets UBLK_IO_FLAG_ABORTED and ends this request(releasing
-	 * the tag). Then the request is re-started(allocating the tag) and we are here.
-	 * Since releasing/allocating a tag implies smp_mb(), finding UBLK_IO_FLAG_ABORTED
-	 * guarantees that here is a re-issued request aborted previously.
-	 */
-	if (unlikely(io->flags & UBLK_IO_FLAG_ABORTED)) {
-		ublk_abort_io_cmds(ubq);
-	} else {
+	if (llist_add(&data->node, &ubq->io_cmds)) {
+		struct ublk_io *io = &ubq->ios[rq->tag];
 		struct io_uring_cmd *cmd = io->cmd;
 		struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
 
@@ -1274,6 +1253,7 @@ static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
 {
 	struct ublk_queue *ubq = hctx->driver_data;
 	struct request *rq = bd->rq;
+	struct ublk_io *io = &ubq->ios[rq->tag];
 	blk_status_t res;
 
 	/* fill iod to slot in io cmd buffer */
@@ -1293,13 +1273,12 @@ static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (ublk_queue_can_use_recovery(ubq) && unlikely(ubq->force_abort))
 		return BLK_STS_IOERR;
 
-	blk_mq_start_request(bd->rq);
-
-	if (unlikely(ubq_daemon_is_dying(ubq))) {
+	if (unlikely(io->flags & UBLK_IO_FLAG_ABORTED)) {
 		__ublk_abort_rq(ubq, rq);
 		return BLK_STS_OK;
 	}
 
+	blk_mq_start_request(bd->rq);
 	ublk_queue_cmd(ubq, rq);
 
 	return BLK_STS_OK;
@@ -1429,6 +1408,9 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];
 
+		if (!(io->flags & UBLK_IO_FLAG_ABORTED))
+			io->flags |= UBLK_IO_FLAG_ABORTED;
+
 		if (!(io->flags & UBLK_IO_FLAG_ACTIVE)) {
 			struct request *rq;
 
@@ -1437,7 +1419,7 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 			 * will do it
 			 */
 			rq = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], i);
-			if (rq)
+			if (rq && blk_mq_request_started(rq))
 				__ublk_fail_req(ubq, io, rq);
 		}
 	}
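[Editor's note: the simplified handshake, restated as a sketch under the
assumption (from patches 8-9) that the queue is quiesced while the abort
runs; DEMO_IO_ABORTED and the demo_* names are hypothetical:

#include <linux/blk-mq.h>

#define DEMO_IO_ABORTED	(1U << 0)

struct demo_io {
	unsigned int flags;
};

/*
 * The abort path sets DEMO_IO_ABORTED while ->queue_rq() is quiesced;
 * afterwards ->queue_rq() only has to observe the flag, with no need to
 * inspect the daemon task's exit state.
 */
static blk_status_t demo_queue_rq_check(struct demo_io *io,
					struct request *rq)
{
	if (unlikely(io->flags & DEMO_IO_ABORTED)) {
		blk_mq_end_request(rq, BLK_STS_IOERR);
		return BLK_STS_OK;	/* request ended, not retried */
	}
	blk_mq_start_request(rq);
	/* ... hand the request to the server here ... */
	return BLK_STS_OK;
}

Starting the request only after the abort check is what lets the abort
path use blk_mq_request_started() to skip requests it must not touch.]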