From patchwork Tue Mar 11 16:00:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012236 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B19B9C282EC for ; Tue, 11 Mar 2025 16:06:26 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts25J-0007dc-B3; Tue, 11 Mar 2025 12:04:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts21z-0003sB-93 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:35 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts21x-0005fI-Fg for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708832; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=C4d7rCQo/2zgKQ6hXTlL5Or1bUZEDZQka54b0/Sz75k=; b=ZhXizS8MnnmI9KasVKTwQd0olVPO9M/6Mu83U3R0xqbOL07xOdD4LkxfBsYuEMF83MdTdS uQAq+Jy7jmV/FAV9tEEOSDRW0C5gsK8gq3aA4rJjlYhIuG5DRHrzd7NwWu9+rnIokyI0gD 3Ur73JgXew9tvqvvM1FQROxX9wPkK4o= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-616-1X7CheqzNceWke4e0pEBlw-1; Tue, 11 Mar 2025 12:00:29 -0400 X-MC-Unique: 1X7CheqzNceWke4e0pEBlw-1 X-Mimecast-MFC-AGG-ID: 1X7CheqzNceWke4e0pEBlw_1741708828 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1C2B81801A00; Tue, 11 Mar 2025 16:00:28 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 778301828A95; Tue, 11 Mar 2025 16:00:26 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 01/22] block: Remove unused blk_op_is_blocked() Date: Tue, 11 Mar 2025 17:00:00 +0100 Message-ID: <20250311160021.349761-2-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Commit fc4e394b28 removed the last caller of blk_op_is_blocked(). Remove the now unused function. Signed-off-by: Kevin Wolf Message-ID: <20250206165331.379033-1-kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Stefan Hajnoczi Signed-off-by: Kevin Wolf --- include/system/block-backend-global-state.h | 1 - block/block-backend.c | 12 ------------ 2 files changed, 13 deletions(-) diff --git a/include/system/block-backend-global-state.h b/include/system/block-backend-global-state.h index 9cc9b008ec..35b5e8380b 100644 --- a/include/system/block-backend-global-state.h +++ b/include/system/block-backend-global-state.h @@ -86,7 +86,6 @@ bool blk_supports_write_perm(BlockBackend *blk); bool blk_is_sg(BlockBackend *blk); void blk_set_enable_write_cache(BlockBackend *blk, bool wce); int blk_get_flags(BlockBackend *blk); -bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp); int blk_set_aio_context(BlockBackend *blk, AioContext *new_context, Error **errp); void blk_add_aio_context_notifier(BlockBackend *blk, diff --git a/block/block-backend.c b/block/block-backend.c index 9288f7e1c6..a402db13f2 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -2357,18 +2357,6 @@ void *blk_blockalign(BlockBackend *blk, size_t size) return qemu_blockalign(blk ? blk_bs(blk) : NULL, size); } -bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp) -{ - BlockDriverState *bs = blk_bs(blk); - GLOBAL_STATE_CODE(); - GRAPH_RDLOCK_GUARD_MAINLOOP(); - - if (!bs) { - return false; - } - - return bdrv_op_is_blocked(bs, op, errp); -} /** * Return BB's current AioContext. Note that this context may change From patchwork Tue Mar 11 16:00:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012235 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6BE6AC28B2F for ; Tue, 11 Mar 2025 16:05:25 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts25p-00086u-TP; Tue, 11 Mar 2025 12:04:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts221-0003vt-I3 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:37 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts21z-0005g5-C7 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708834; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5mO5SRVvROBnmvlo4dK2r1R6mGY6qiEPIuhViqul6cM=; b=eQsccjpsiD/GSAe+C8ZRucL4+TjMLb2XBi0XX/J4pdpH9Q0iazhB0yMu8apxV/zxF20hHg 9HSU2dtE+OrHHrvN0cG3KoB0uX7mmS0oEFi+vCqEIoD9nVnJgYYbm0GgVqlujny39Qqm+4 lYyYMJe3cM3MLJYW38nh7Z6zTK46I9I= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-577-0BWuWzDONlquJIdGSx_jBg-1; Tue, 11 Mar 2025 12:00:30 -0400 X-MC-Unique: 0BWuWzDONlquJIdGSx_jBg-1 X-Mimecast-MFC-AGG-ID: 0BWuWzDONlquJIdGSx_jBg_1741708830 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 2802A19560A1; Tue, 11 Mar 2025 16:00:30 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 83E42180AF7C; Tue, 11 Mar 2025 16:00:28 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 02/22] block: Zero block driver state before reopening Date: Tue, 11 Mar 2025 17:00:01 +0100 Message-ID: <20250311160021.349761-3-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Block drivers assume in their .bdrv_open() implementation that their state in bs->opaque has been zeroed; it is initially allocated with g_malloc0() in bdrv_open_driver(). bdrv_snapshot_goto() needs to make sure that it is zeroed again before calling drv->bdrv_open() to avoid that block drivers use stale values. One symptom of this bug is VMDK running into a double free when the user tries to apply an internal snapshot like 'qemu-img snapshot -a test test.vmdk'. This should be a graceful error because VMDK doesn't support internal snapshots. ==25507== Invalid free() / delete / delete[] / realloc() ==25507== at 0x484B347: realloc (vg_replace_malloc.c:1801) ==25507== by 0x54B592A: g_realloc (gmem.c:171) ==25507== by 0x1B221D: vmdk_add_extent (../block/vmdk.c:570) ==25507== by 0x1B1084: vmdk_open_sparse (../block/vmdk.c:1059) ==25507== by 0x1AF3D8: vmdk_open (../block/vmdk.c:1371) ==25507== by 0x1A2AE0: bdrv_snapshot_goto (../block/snapshot.c:299) ==25507== by 0x205C77: img_snapshot (../qemu-img.c:3500) ==25507== by 0x58FA087: (below main) (libc_start_call_main.h:58) ==25507== Address 0x832f3e0 is 0 bytes inside a block of size 272 free'd ==25507== at 0x4846B83: free (vg_replace_malloc.c:989) ==25507== by 0x54AEAC4: g_free (gmem.c:208) ==25507== by 0x1AF629: vmdk_close (../block/vmdk.c:2889) ==25507== by 0x1A2A9C: bdrv_snapshot_goto (../block/snapshot.c:290) ==25507== by 0x205C77: img_snapshot (../qemu-img.c:3500) ==25507== by 0x58FA087: (below main) (libc_start_call_main.h:58) This error was discovered by fuzzing qemu-img. Cc: qemu-stable@nongnu.org Closes: https://gitlab.com/qemu-project/qemu/-/issues/2853 Closes: https://gitlab.com/qemu-project/qemu/-/issues/2851 Reported-by: Denis Rastyogin Signed-off-by: Kevin Wolf Message-ID: <20250310104858.28221-1-kwolf@redhat.com> Signed-off-by: Kevin Wolf --- block/snapshot.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/snapshot.c b/block/snapshot.c index 9c44780e96..22567f1fb9 100644 --- a/block/snapshot.c +++ b/block/snapshot.c @@ -296,6 +296,7 @@ int bdrv_snapshot_goto(BlockDriverState *bs, bdrv_graph_wrunlock(); ret = bdrv_snapshot_goto(fallback_bs, snapshot_id, errp); + memset(bs->opaque, 0, drv->instance_size); open_ret = drv->bdrv_open(bs, options, bs->open_flags, &local_err); qobject_unref(options); if (open_ret < 0) { From patchwork Tue Mar 11 16:00:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012246 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7B7A5C282EC for ; Tue, 11 Mar 2025 16:09:32 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts27F-0000pS-4z; Tue, 11 Mar 2025 12:06:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts226-000404-PD for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:46 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts222-0005ga-Av for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:42 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708836; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AhbSDLW1mvIdHsTaWJpwcwG5e2v3Jhauj4Y3iWtwrmc=; b=CDZlShjCKc2I/N7fpORGlKaJlcyyjFPfxMkRmwh+TpjyqgIIsjvCwkZB8c7yjB11tnJtQE je87K/iyCwXeEHh7FtyehLu/LJxwr/k0bNelvHbF6d/uJsCYDp4I01QiVgDL5JyWVN6iuy uL0YyuEd9ESvha3SoH3Acjq5llgEcSk= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-164-1x9wGZCgOwCz-g683vM4-w-1; Tue, 11 Mar 2025 12:00:33 -0400 X-MC-Unique: 1x9wGZCgOwCz-g683vM4-w-1 X-Mimecast-MFC-AGG-ID: 1x9wGZCgOwCz-g683vM4-w_1741708832 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4C173180049D; Tue, 11 Mar 2025 16:00:32 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 9CBF91800373; Tue, 11 Mar 2025 16:00:30 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 03/22] file-posix: Support FUA writes Date: Tue, 11 Mar 2025 17:00:02 +0100 Message-ID: <20250311160021.349761-4-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Until now, FUA was always emulated with a separate flush after the write for file-posix. The overhead of processing a second request can reduce performance significantly for a guest disk that has disabled the write cache, especially if the host disk is already write through, too, and the flush isn't actually doing anything. Advertise support for REQ_FUA in write requests and implement it for Linux AIO and io_uring using the RWF_DSYNC flag for write requests. The thread pool still performs a separate fdatasync() call. This can be improved later by using the pwritev2() syscall if available. As an example, this is how fio numbers can be improved in some scenarios with this patch (all using virtio-blk with cache=directsync on an nvme block device for the VM, fio with ioengine=libaio,direct=1,sync=1): | old | with FUA support ------------------------------+---------------+------------------- bs=4k, iodepth=1, numjobs=1 | 45.6k iops | 56.1k iops bs=4k, iodepth=1, numjobs=16 | 183.3k iops | 236.0k iops bs=4k, iodepth=16, numjobs=1 | 258.4k iops | 311.1k iops However, not all scenarios are clear wins. On another slower disk I saw little to no improvment. In fact, in two corner case scenarios, I even observed a regression, which I however consider acceptable: 1. On slow host disks in a write through cache mode, when the guest is using virtio-blk in a separate iothread so that polling can be enabled, and each completion is quickly followed up with a new request (so that polling gets it), it can happen that enabling FUA makes things slower - the additional very fast no-op flush we used to have gave the adaptive polling algorithm a success so that it kept polling. Without it, we only have the slow write request, which disables polling. This is a problem in the polling algorithm that will be fixed later in this series. 2. With a high queue depth, it can be beneficial to have flush requests for another reason: The optimisation in bdrv_co_flush() that flushes only once per write generation acts as a synchronisation mechanism that lets all requests complete at the same time. This can result in better batching and if the disk is very fast (I only saw this with a null_blk backend), this can make up for the overhead of the flush and improve throughput. In theory, we could optionally introduce a similar artificial latency in the normal completion path to achieve the same kind of completion batching. This is not implemented in this series. Compatibility is not a concern for io_uring, it has supported RWF_DSYNC from the start. Linux AIO started supporting it in Linux 4.13 and libaio 0.3.111. The kernel is not a problem for any supported build platform, so it's not necessary to add runtime checks. However, openSUSE is still stuck with an older libaio version that would break the build. We must detect this at build time to avoid build failures. Signed-off-by: Kevin Wolf Message-ID: <20250307221634.71951-2-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi Signed-off-by: Kevin Wolf --- include/block/raw-aio.h | 8 ++++++-- block/file-posix.c | 26 ++++++++++++++++++-------- block/io_uring.c | 13 ++++++++----- block/linux-aio.c | 24 +++++++++++++++++++++--- meson.build | 4 ++++ 5 files changed, 57 insertions(+), 18 deletions(-) diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h index 626706827f..247bdbff13 100644 --- a/include/block/raw-aio.h +++ b/include/block/raw-aio.h @@ -17,6 +17,7 @@ #define QEMU_RAW_AIO_H #include "block/aio.h" +#include "block/block-common.h" #include "qemu/iov.h" /* AIO request types */ @@ -58,9 +59,11 @@ void laio_cleanup(LinuxAioState *s); /* laio_co_submit: submit I/O requests in the thread's current AioContext. */ int coroutine_fn laio_co_submit(int fd, uint64_t offset, QEMUIOVector *qiov, - int type, uint64_t dev_max_batch); + int type, BdrvRequestFlags flags, + uint64_t dev_max_batch); bool laio_has_fdsync(int); +bool laio_has_fua(void); void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context); void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context); #endif @@ -71,7 +74,8 @@ void luring_cleanup(LuringState *s); /* luring_co_submit: submit I/O requests in the thread's current AioContext. */ int coroutine_fn luring_co_submit(BlockDriverState *bs, int fd, uint64_t offset, - QEMUIOVector *qiov, int type); + QEMUIOVector *qiov, int type, + BdrvRequestFlags flags); void luring_detach_aio_context(LuringState *s, AioContext *old_context); void luring_attach_aio_context(LuringState *s, AioContext *new_context); #endif diff --git a/block/file-posix.c b/block/file-posix.c index 44e16dda87..0f1c722804 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -194,6 +194,7 @@ static int fd_open(BlockDriverState *bs) } static int64_t raw_getlength(BlockDriverState *bs); +static int coroutine_fn raw_co_flush_to_disk(BlockDriverState *bs); typedef struct RawPosixAIOData { BlockDriverState *bs; @@ -804,6 +805,10 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, #endif s->needs_alignment = raw_needs_alignment(bs); + if (!s->use_linux_aio || laio_has_fua()) { + bs->supported_write_flags = BDRV_REQ_FUA; + } + bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK; if (S_ISREG(st.st_mode)) { /* When extending regular files, we get zeros from the OS */ @@ -2477,7 +2482,8 @@ static inline bool raw_check_linux_aio(BDRVRawState *s) #endif static int coroutine_fn raw_co_prw(BlockDriverState *bs, int64_t *offset_ptr, - uint64_t bytes, QEMUIOVector *qiov, int type) + uint64_t bytes, QEMUIOVector *qiov, int type, + int flags) { BDRVRawState *s = bs->opaque; RawPosixAIOData acb; @@ -2508,13 +2514,13 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, int64_t *offset_ptr, #ifdef CONFIG_LINUX_IO_URING } else if (raw_check_linux_io_uring(s)) { assert(qiov->size == bytes); - ret = luring_co_submit(bs, s->fd, offset, qiov, type); + ret = luring_co_submit(bs, s->fd, offset, qiov, type, flags); goto out; #endif #ifdef CONFIG_LINUX_AIO } else if (raw_check_linux_aio(s)) { assert(qiov->size == bytes); - ret = laio_co_submit(s->fd, offset, qiov, type, + ret = laio_co_submit(s->fd, offset, qiov, type, flags, s->aio_max_batch); goto out; #endif @@ -2534,6 +2540,10 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, int64_t *offset_ptr, assert(qiov->size == bytes); ret = raw_thread_pool_submit(handle_aiocb_rw, &acb); + if (ret == 0 && (flags & BDRV_REQ_FUA)) { + /* TODO Use pwritev2() instead if it's available */ + ret = raw_co_flush_to_disk(bs); + } goto out; /* Avoid the compiler err of unused label */ out: @@ -2571,14 +2581,14 @@ static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) { - return raw_co_prw(bs, &offset, bytes, qiov, QEMU_AIO_READ); + return raw_co_prw(bs, &offset, bytes, qiov, QEMU_AIO_READ, flags); } static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) { - return raw_co_prw(bs, &offset, bytes, qiov, QEMU_AIO_WRITE); + return raw_co_prw(bs, &offset, bytes, qiov, QEMU_AIO_WRITE, flags); } static int coroutine_fn raw_co_flush_to_disk(BlockDriverState *bs) @@ -2600,12 +2610,12 @@ static int coroutine_fn raw_co_flush_to_disk(BlockDriverState *bs) #ifdef CONFIG_LINUX_IO_URING if (raw_check_linux_io_uring(s)) { - return luring_co_submit(bs, s->fd, 0, NULL, QEMU_AIO_FLUSH); + return luring_co_submit(bs, s->fd, 0, NULL, QEMU_AIO_FLUSH, 0); } #endif #ifdef CONFIG_LINUX_AIO if (s->has_laio_fdsync && raw_check_linux_aio(s)) { - return laio_co_submit(s->fd, 0, NULL, QEMU_AIO_FLUSH, 0); + return laio_co_submit(s->fd, 0, NULL, QEMU_AIO_FLUSH, 0, 0); } #endif return raw_thread_pool_submit(handle_aiocb_flush, &acb); @@ -3540,7 +3550,7 @@ static int coroutine_fn raw_co_zone_append(BlockDriverState *bs, } trace_zbd_zone_append(bs, *offset >> BDRV_SECTOR_BITS); - return raw_co_prw(bs, offset, len, qiov, QEMU_AIO_ZONE_APPEND); + return raw_co_prw(bs, offset, len, qiov, QEMU_AIO_ZONE_APPEND, 0); } #endif diff --git a/block/io_uring.c b/block/io_uring.c index f52b66b340..dc967dbf91 100644 --- a/block/io_uring.c +++ b/block/io_uring.c @@ -335,15 +335,17 @@ static void luring_deferred_fn(void *opaque) * */ static int luring_do_submit(int fd, LuringAIOCB *luringcb, LuringState *s, - uint64_t offset, int type) + uint64_t offset, int type, BdrvRequestFlags flags) { int ret; struct io_uring_sqe *sqes = &luringcb->sqeq; + int luring_flags; switch (type) { case QEMU_AIO_WRITE: - io_uring_prep_writev(sqes, fd, luringcb->qiov->iov, - luringcb->qiov->niov, offset); + luring_flags = (flags & BDRV_REQ_FUA) ? RWF_DSYNC : 0; + io_uring_prep_writev2(sqes, fd, luringcb->qiov->iov, + luringcb->qiov->niov, offset, luring_flags); break; case QEMU_AIO_ZONE_APPEND: io_uring_prep_writev(sqes, fd, luringcb->qiov->iov, @@ -380,7 +382,8 @@ static int luring_do_submit(int fd, LuringAIOCB *luringcb, LuringState *s, } int coroutine_fn luring_co_submit(BlockDriverState *bs, int fd, uint64_t offset, - QEMUIOVector *qiov, int type) + QEMUIOVector *qiov, int type, + BdrvRequestFlags flags) { int ret; AioContext *ctx = qemu_get_current_aio_context(); @@ -393,7 +396,7 @@ int coroutine_fn luring_co_submit(BlockDriverState *bs, int fd, uint64_t offset, }; trace_luring_co_submit(bs, s, &luringcb, fd, offset, qiov ? qiov->size : 0, type); - ret = luring_do_submit(fd, &luringcb, s, offset, type); + ret = luring_do_submit(fd, &luringcb, s, offset, type, flags); if (ret < 0) { return ret; diff --git a/block/linux-aio.c b/block/linux-aio.c index 194c8f434f..1108ae361c 100644 --- a/block/linux-aio.c +++ b/block/linux-aio.c @@ -368,15 +368,23 @@ static void laio_deferred_fn(void *opaque) } static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset, - int type, uint64_t dev_max_batch) + int type, BdrvRequestFlags flags, + uint64_t dev_max_batch) { LinuxAioState *s = laiocb->ctx; struct iocb *iocbs = &laiocb->iocb; QEMUIOVector *qiov = laiocb->qiov; + int laio_flags; switch (type) { case QEMU_AIO_WRITE: +#ifdef HAVE_IO_PREP_PWRITEV2 + laio_flags = (flags & BDRV_REQ_FUA) ? RWF_DSYNC : 0; + io_prep_pwritev2(iocbs, fd, qiov->iov, qiov->niov, offset, laio_flags); +#else + assert(flags == 0); io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset); +#endif break; case QEMU_AIO_ZONE_APPEND: io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset); @@ -409,7 +417,8 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset, } int coroutine_fn laio_co_submit(int fd, uint64_t offset, QEMUIOVector *qiov, - int type, uint64_t dev_max_batch) + int type, BdrvRequestFlags flags, + uint64_t dev_max_batch) { int ret; AioContext *ctx = qemu_get_current_aio_context(); @@ -422,7 +431,7 @@ int coroutine_fn laio_co_submit(int fd, uint64_t offset, QEMUIOVector *qiov, .qiov = qiov, }; - ret = laio_do_submit(fd, &laiocb, offset, type, dev_max_batch); + ret = laio_do_submit(fd, &laiocb, offset, type, flags, dev_max_batch); if (ret < 0) { return ret; } @@ -505,3 +514,12 @@ bool laio_has_fdsync(int fd) io_destroy(ctx); return (ret == -EINVAL) ? false : true; } + +bool laio_has_fua(void) +{ +#ifdef HAVE_IO_PREP_PWRITEV2 + return true; +#else + return false; +#endif +} diff --git a/meson.build b/meson.build index 9d9c11731f..3dad23794e 100644 --- a/meson.build +++ b/meson.build @@ -2727,6 +2727,10 @@ config_host_data.set('HAVE_OPTRESET', cc.has_header_symbol('getopt.h', 'optreset')) config_host_data.set('HAVE_IPPROTO_MPTCP', cc.has_header_symbol('netinet/in.h', 'IPPROTO_MPTCP')) +if libaio.found() + config_host_data.set('HAVE_IO_PREP_PWRITEV2', + cc.has_header_symbol('libaio.h', 'io_prep_pwritev2')) +endif # has_member config_host_data.set('HAVE_SIGEV_NOTIFY_THREAD_ID', From patchwork Tue Mar 11 16:00:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012234 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 78759C282EC for ; Tue, 11 Mar 2025 16:04:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts250-0006vf-RY; Tue, 11 Mar 2025 12:03:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts228-00040D-PV for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:48 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts223-0005hG-KJ for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708838; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AhHL2Fdt+g9QlS9SfWAj9l1MOwzrzbrrezwChzAgHIQ=; b=d7Oin3Z1R6bSn5qxiJC7FaREe728MSan9wWIkL5vnY0PnL3ESkzhmiqVu+NnEUQNh100T3 Gif1ebh9woqxSYcjODQdYsenUbrMkg89qlfXn/LlSEo+IW+EAtEPnHvUE2x/FTkYXlKh1d GsBmfr2Lto/etxFKSyUhyb92ETs4Pec= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-68-0c4lZ8eENoemJB67w3EoWw-1; Tue, 11 Mar 2025 12:00:35 -0400 X-MC-Unique: 0c4lZ8eENoemJB67w3EoWw-1 X-Mimecast-MFC-AGG-ID: 0c4lZ8eENoemJB67w3EoWw_1741708834 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 6071A19560A2; Tue, 11 Mar 2025 16:00:34 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C1ED71800373; Tue, 11 Mar 2025 16:00:32 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 04/22] block/io: Ignore FUA with cache.no-flush=on Date: Tue, 11 Mar 2025 17:00:03 +0100 Message-ID: <20250311160021.349761-5-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -16 X-Spam_score: -1.7 X-Spam_bar: - X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, DKIM_INVALID=0.1, DKIM_SIGNED=0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org For block drivers that don't advertise FUA support, we already call bdrv_co_flush(), which considers BDRV_O_NO_FLUSH. However, drivers that do support FUA still see the FUA flag with BDRV_O_NO_FLUSH and get the associated performance penalty that cache.no-flush=on was supposed to avoid. Clear FUA for write requests if BDRV_O_NO_FLUSH is set. Signed-off-by: Kevin Wolf Message-ID: <20250307221634.71951-3-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi Signed-off-by: Kevin Wolf --- block/io.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/block/io.c b/block/io.c index d369b994df..1ba8d1aeea 100644 --- a/block/io.c +++ b/block/io.c @@ -1058,6 +1058,10 @@ bdrv_driver_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, return -ENOMEDIUM; } + if (bs->open_flags & BDRV_O_NO_FLUSH) { + flags &= ~BDRV_REQ_FUA; + } + if ((flags & BDRV_REQ_FUA) && (~bs->supported_write_flags & BDRV_REQ_FUA)) { flags &= ~BDRV_REQ_FUA; From patchwork Tue Mar 11 16:00:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012253 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F17F6C2BA1B for ; Tue, 11 Mar 2025 16:10:36 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts26X-0000NV-O4; Tue, 11 Mar 2025 12:05:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts226-0003zR-63 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:42 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts223-0005hI-Ol for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:41 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708838; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AEHUeb7rP3pziRM5M1M5vXuLC6X6GpTgsDjSzUiJWqg=; b=Sz4Xf2RdLAqy+mmkP8lWI44dCY97N8wGhqXVU5AE5AtY33JKfD9TaMtKtkNjejBU9SrmrB YedEKEZPd8gTGCEkAb1aR64C6msD2oszArtdaPyfjFTrkLPS2+tRuDcW97ZmeFbMD4moRF BM7m58d5OaPxgu6f1AqQk/Yi0huIa2w= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-548-f2R_OsgWMsCoG6vfiraPMA-1; Tue, 11 Mar 2025 12:00:37 -0400 X-MC-Unique: f2R_OsgWMsCoG6vfiraPMA-1 X-Mimecast-MFC-AGG-ID: f2R_OsgWMsCoG6vfiraPMA_1741708836 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 5E3D319560AF; Tue, 11 Mar 2025 16:00:36 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id D46EA180AF7B; Tue, 11 Mar 2025 16:00:34 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 05/22] aio: Create AioPolledEvent Date: Tue, 11 Mar 2025 17:00:04 +0100 Message-ID: <20250311160021.349761-6-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org As a preparation for having multiple adaptive polling states per AioContext, move the 'ns' field into a separate struct. Signed-off-by: Kevin Wolf Message-ID: <20250307221634.71951-4-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi Signed-off-by: Kevin Wolf --- include/block/aio.h | 6 +++++- util/aio-posix.c | 31 ++++++++++++++++--------------- util/async.c | 3 ++- 3 files changed, 23 insertions(+), 17 deletions(-) diff --git a/include/block/aio.h b/include/block/aio.h index b2ab3514de..c9fcfe7ccf 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -123,6 +123,10 @@ struct BHListSlice { typedef QSLIST_HEAD(, AioHandler) AioHandlerSList; +typedef struct AioPolledEvent { + int64_t ns; /* current polling time in nanoseconds */ +} AioPolledEvent; + struct AioContext { GSource source; @@ -229,7 +233,7 @@ struct AioContext { int poll_disable_cnt; /* Polling mode parameters */ - int64_t poll_ns; /* current polling time in nanoseconds */ + AioPolledEvent poll; int64_t poll_max_ns; /* maximum polling time in nanoseconds */ int64_t poll_grow; /* polling time growth factor */ int64_t poll_shrink; /* polling time shrink factor */ diff --git a/util/aio-posix.c b/util/aio-posix.c index 06bf9f456c..95bddb9e4b 100644 --- a/util/aio-posix.c +++ b/util/aio-posix.c @@ -585,7 +585,7 @@ static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list, return false; } - max_ns = qemu_soonest_timeout(*timeout, ctx->poll_ns); + max_ns = qemu_soonest_timeout(*timeout, ctx->poll.ns); if (max_ns && !ctx->fdmon_ops->need_wait(ctx)) { /* * Enable poll mode. It pairs with the poll_set_started() in @@ -683,40 +683,40 @@ bool aio_poll(AioContext *ctx, bool blocking) if (ctx->poll_max_ns) { int64_t block_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start; - if (block_ns <= ctx->poll_ns) { + if (block_ns <= ctx->poll.ns) { /* This is the sweet spot, no adjustment needed */ } else if (block_ns > ctx->poll_max_ns) { /* We'd have to poll for too long, poll less */ - int64_t old = ctx->poll_ns; + int64_t old = ctx->poll.ns; if (ctx->poll_shrink) { - ctx->poll_ns /= ctx->poll_shrink; + ctx->poll.ns /= ctx->poll_shrink; } else { - ctx->poll_ns = 0; + ctx->poll.ns = 0; } - trace_poll_shrink(ctx, old, ctx->poll_ns); - } else if (ctx->poll_ns < ctx->poll_max_ns && + trace_poll_shrink(ctx, old, ctx->poll.ns); + } else if (ctx->poll.ns < ctx->poll_max_ns && block_ns < ctx->poll_max_ns) { /* There is room to grow, poll longer */ - int64_t old = ctx->poll_ns; + int64_t old = ctx->poll.ns; int64_t grow = ctx->poll_grow; if (grow == 0) { grow = 2; } - if (ctx->poll_ns) { - ctx->poll_ns *= grow; + if (ctx->poll.ns) { + ctx->poll.ns *= grow; } else { - ctx->poll_ns = 4000; /* start polling at 4 microseconds */ + ctx->poll.ns = 4000; /* start polling at 4 microseconds */ } - if (ctx->poll_ns > ctx->poll_max_ns) { - ctx->poll_ns = ctx->poll_max_ns; + if (ctx->poll.ns > ctx->poll_max_ns) { + ctx->poll.ns = ctx->poll_max_ns; } - trace_poll_grow(ctx, old, ctx->poll_ns); + trace_poll_grow(ctx, old, ctx->poll.ns); } } @@ -770,8 +770,9 @@ void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns, /* No thread synchronization here, it doesn't matter if an incorrect value * is used once. */ + ctx->poll.ns = 0; + ctx->poll_max_ns = max_ns; - ctx->poll_ns = 0; ctx->poll_grow = grow; ctx->poll_shrink = shrink; diff --git a/util/async.c b/util/async.c index 47e3d35a26..fc8a78aa79 100644 --- a/util/async.c +++ b/util/async.c @@ -609,7 +609,8 @@ AioContext *aio_context_new(Error **errp) qemu_rec_mutex_init(&ctx->lock); timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx); - ctx->poll_ns = 0; + ctx->poll.ns = 0; + ctx->poll_max_ns = 0; ctx->poll_grow = 0; ctx->poll_shrink = 0; From patchwork Tue Mar 11 16:00:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012245 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A6296C282EC for ; Tue, 11 Mar 2025 16:07:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts27K-0000yC-N4; Tue, 11 Mar 2025 12:06:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22B-00041a-Ia for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:48 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts226-0005hj-JQ for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708840; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CDc7/ZAT5C8YLyxi7akQ15Oo1z5OtAKquDiEGruG1tY=; b=LA+1dwO6iJMprk3XuCwJ1WKYE9g0dWsLBmqj5AJoJd74IpXWphJJgJwSWHAmU2iA1gY3LW NYjileGBz/wj5C9VKRUgXoUDSPP8omhLjL0iKNp29zxLHlnu+mX+fuB5LUbgZNhHZ+7uRs K3MYLsJgkqX77HMILOqZujy5S4BKXjA= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-508-gR-2nzPMMbi1r7GCSJqDSg-1; Tue, 11 Mar 2025 12:00:39 -0400 X-MC-Unique: gR-2nzPMMbi1r7GCSJqDSg-1 X-Mimecast-MFC-AGG-ID: gR-2nzPMMbi1r7GCSJqDSg_1741708838 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7B5D519560BB; Tue, 11 Mar 2025 16:00:38 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id D788F1800373; Tue, 11 Mar 2025 16:00:36 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 06/22] aio-posix: Factor out adjust_polling_time() Date: Tue, 11 Mar 2025 17:00:05 +0100 Message-ID: <20250311160021.349761-7-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Kevin Wolf Message-ID: <20250307221634.71951-5-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi Signed-off-by: Kevin Wolf --- util/aio-posix.c | 77 ++++++++++++++++++++++++++---------------------- 1 file changed, 41 insertions(+), 36 deletions(-) diff --git a/util/aio-posix.c b/util/aio-posix.c index 95bddb9e4b..259827c7ad 100644 --- a/util/aio-posix.c +++ b/util/aio-posix.c @@ -600,6 +600,46 @@ static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list, return false; } +static void adjust_polling_time(AioContext *ctx, AioPolledEvent *poll, + int64_t block_ns) +{ + if (block_ns <= poll->ns) { + /* This is the sweet spot, no adjustment needed */ + } else if (block_ns > ctx->poll_max_ns) { + /* We'd have to poll for too long, poll less */ + int64_t old = poll->ns; + + if (ctx->poll_shrink) { + poll->ns /= ctx->poll_shrink; + } else { + poll->ns = 0; + } + + trace_poll_shrink(ctx, old, poll->ns); + } else if (poll->ns < ctx->poll_max_ns && + block_ns < ctx->poll_max_ns) { + /* There is room to grow, poll longer */ + int64_t old = poll->ns; + int64_t grow = ctx->poll_grow; + + if (grow == 0) { + grow = 2; + } + + if (poll->ns) { + poll->ns *= grow; + } else { + poll->ns = 4000; /* start polling at 4 microseconds */ + } + + if (poll->ns > ctx->poll_max_ns) { + poll->ns = ctx->poll_max_ns; + } + + trace_poll_grow(ctx, old, poll->ns); + } +} + bool aio_poll(AioContext *ctx, bool blocking) { AioHandlerList ready_list = QLIST_HEAD_INITIALIZER(ready_list); @@ -682,42 +722,7 @@ bool aio_poll(AioContext *ctx, bool blocking) /* Adjust polling time */ if (ctx->poll_max_ns) { int64_t block_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start; - - if (block_ns <= ctx->poll.ns) { - /* This is the sweet spot, no adjustment needed */ - } else if (block_ns > ctx->poll_max_ns) { - /* We'd have to poll for too long, poll less */ - int64_t old = ctx->poll.ns; - - if (ctx->poll_shrink) { - ctx->poll.ns /= ctx->poll_shrink; - } else { - ctx->poll.ns = 0; - } - - trace_poll_shrink(ctx, old, ctx->poll.ns); - } else if (ctx->poll.ns < ctx->poll_max_ns && - block_ns < ctx->poll_max_ns) { - /* There is room to grow, poll longer */ - int64_t old = ctx->poll.ns; - int64_t grow = ctx->poll_grow; - - if (grow == 0) { - grow = 2; - } - - if (ctx->poll.ns) { - ctx->poll.ns *= grow; - } else { - ctx->poll.ns = 4000; /* start polling at 4 microseconds */ - } - - if (ctx->poll.ns > ctx->poll_max_ns) { - ctx->poll.ns = ctx->poll_max_ns; - } - - trace_poll_grow(ctx, old, ctx->poll.ns); - } + adjust_polling_time(ctx, &ctx->poll, block_ns); } progress |= aio_bh_poll(ctx); From patchwork Tue Mar 11 16:00:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012254 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 94AF9C282EC for ; Tue, 11 Mar 2025 16:10:38 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts29F-00039A-3H; Tue, 11 Mar 2025 12:08:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22I-0004Cx-Ta for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:55 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts229-0005iA-VM for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708843; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TgaVlRVnHPlvnZVOn+8a9rDfbPZzB3p3yplbpO7u+e4=; b=Fp5493Zk9nM3hwOiqblmfnPR51z8LjISaPxyS1zOvt9S7HNdTldnlqXWZ11Upkee/v42xU CvbvY7bZqLpgzr9l/ofaAeyG6GXAJ1CVbR3HxcmGsfw3I6Fe9ZJ9IpNzPWBHVTV5Zv19rd O2LRt2wDr1qEfW10xuHoM0UeFXZkYS8= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-47-q5uHGvPSPzabYAdpAXhOqQ-1; Tue, 11 Mar 2025 12:00:41 -0400 X-MC-Unique: q5uHGvPSPzabYAdpAXhOqQ-1 X-Mimecast-MFC-AGG-ID: q5uHGvPSPzabYAdpAXhOqQ_1741708840 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 80FE91800D9F; Tue, 11 Mar 2025 16:00:40 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 022851828A95; Tue, 11 Mar 2025 16:00:38 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 07/22] aio-posix: Separate AioPolledEvent per AioHandler Date: Tue, 11 Mar 2025 17:00:06 +0100 Message-ID: <20250311160021.349761-8-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Adaptive polling has a big problem: It doesn't consider that an event loop can wait for many different events that may have very different typical latencies. For example, think of a guest that tends to send a new I/O request soon after the previous I/O request completes, but the storage on the host is rather slow. In this case, getting the new request from guest quickly means that polling is enabled, but the next thing is performing the I/O request on the backend, which is slow and disables polling again for the next guest request. This means that in such a scenario, polling could help for every other event, but is only ever enabled when it can't succeed. In order to fix this, keep a separate AioPolledEvent for each AioHandler. We will then know that the backend file descriptor always has a high latency and isn't worth polling for, but we also know that the guest is always fast and we should poll for it. This solves at least half of the problem, we can now keep polling for those cases where it makes sense and get the improved performance from it. Since the event loop doesn't know which event will be next, we still do some unnecessary polling while we're waiting for the slow disk. I made some attempts to be more clever than just randomly growing and shrinking the polling time, and even to let callers be explicit about when they expect a new event, but so far this hasn't resulted in improved performance or even caused performance regressions. For now, let's just fix the part that is easy enough to fix, we can revisit the rest later. Signed-off-by: Kevin Wolf Message-ID: <20250307221634.71951-6-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi Signed-off-by: Kevin Wolf --- include/block/aio.h | 1 - util/aio-posix.h | 1 + util/aio-posix.c | 26 ++++++++++++++++++++++---- util/async.c | 2 -- 4 files changed, 23 insertions(+), 7 deletions(-) diff --git a/include/block/aio.h b/include/block/aio.h index c9fcfe7ccf..99ff48420b 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -233,7 +233,6 @@ struct AioContext { int poll_disable_cnt; /* Polling mode parameters */ - AioPolledEvent poll; int64_t poll_max_ns; /* maximum polling time in nanoseconds */ int64_t poll_grow; /* polling time growth factor */ int64_t poll_shrink; /* polling time shrink factor */ diff --git a/util/aio-posix.h b/util/aio-posix.h index 4264c518be..82a0201ea4 100644 --- a/util/aio-posix.h +++ b/util/aio-posix.h @@ -38,6 +38,7 @@ struct AioHandler { #endif int64_t poll_idle_timeout; /* when to stop userspace polling */ bool poll_ready; /* has polling detected an event? */ + AioPolledEvent poll; }; /* Add a handler to a ready list */ diff --git a/util/aio-posix.c b/util/aio-posix.c index 259827c7ad..80785c29d2 100644 --- a/util/aio-posix.c +++ b/util/aio-posix.c @@ -579,13 +579,19 @@ static bool run_poll_handlers(AioContext *ctx, AioHandlerList *ready_list, static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list, int64_t *timeout) { + AioHandler *node; int64_t max_ns; if (QLIST_EMPTY_RCU(&ctx->poll_aio_handlers)) { return false; } - max_ns = qemu_soonest_timeout(*timeout, ctx->poll.ns); + max_ns = 0; + QLIST_FOREACH(node, &ctx->poll_aio_handlers, node_poll) { + max_ns = MAX(max_ns, node->poll.ns); + } + max_ns = qemu_soonest_timeout(*timeout, max_ns); + if (max_ns && !ctx->fdmon_ops->need_wait(ctx)) { /* * Enable poll mode. It pairs with the poll_set_started() in @@ -721,8 +727,14 @@ bool aio_poll(AioContext *ctx, bool blocking) /* Adjust polling time */ if (ctx->poll_max_ns) { + AioHandler *node; int64_t block_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start; - adjust_polling_time(ctx, &ctx->poll, block_ns); + + QLIST_FOREACH(node, &ctx->poll_aio_handlers, node_poll) { + if (QLIST_IS_INSERTED(node, node_ready)) { + adjust_polling_time(ctx, &node->poll, block_ns); + } + } } progress |= aio_bh_poll(ctx); @@ -772,11 +784,17 @@ void aio_context_use_g_source(AioContext *ctx) void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns, int64_t grow, int64_t shrink, Error **errp) { + AioHandler *node; + + qemu_lockcnt_inc(&ctx->list_lock); + QLIST_FOREACH(node, &ctx->aio_handlers, node) { + node->poll.ns = 0; + } + qemu_lockcnt_dec(&ctx->list_lock); + /* No thread synchronization here, it doesn't matter if an incorrect value * is used once. */ - ctx->poll.ns = 0; - ctx->poll_max_ns = max_ns; ctx->poll_grow = grow; ctx->poll_shrink = shrink; diff --git a/util/async.c b/util/async.c index fc8a78aa79..863416dee9 100644 --- a/util/async.c +++ b/util/async.c @@ -609,8 +609,6 @@ AioContext *aio_context_new(Error **errp) qemu_rec_mutex_init(&ctx->lock); timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx); - ctx->poll.ns = 0; - ctx->poll_max_ns = 0; ctx->poll_grow = 0; ctx->poll_shrink = 0; From patchwork Tue Mar 11 16:00:07 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C4976C35FF1 for ; Tue, 11 Mar 2025 16:10:17 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts27r-0001dI-VG; Tue, 11 Mar 2025 12:06:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22D-00044C-N8 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:50 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22B-0005j0-QR for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:49 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708846; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=GDW6dSOze4AXVEFycmqqUI4XjjNB4hqnpaIsI5KoTEQ=; b=NgR4QlwHjUYqKwbmmbYxzC5gk1vGzC9UYSnX9lhFNOwH/xqdUzOK788RgYFesl3gHFGNTK BopWnVvGiPhPM5bsI1jPJAe2aUygpoH0jcchvLTgGV3xlyQaHzNP3fNN9TnPLpjTr/nHPP KovRr+oSPacXU4rlZOjbE0VEXiCPJ1E= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-355-hgG9bouPOh29Zv4N9rzJQg-1; Tue, 11 Mar 2025 12:00:43 -0400 X-MC-Unique: hgG9bouPOh29Zv4N9rzJQg-1 X-Mimecast-MFC-AGG-ID: hgG9bouPOh29Zv4N9rzJQg_1741708842 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A32D9180025B; Tue, 11 Mar 2025 16:00:42 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 070401800373; Tue, 11 Mar 2025 16:00:40 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 08/22] aio-posix: Adjust polling time also for new handlers Date: Tue, 11 Mar 2025 17:00:07 +0100 Message-ID: <20250311160021.349761-9-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org aio_dispatch_handler() adds handlers to ctx->poll_aio_handlers if polling should be enabled. If we call adjust_polling_time() for all polling handlers before this, new polling handlers are still left at poll->ns = 0 and polling is only actually enabled after the next event. Move the adjust_polling_time() call after aio_dispatch_handler(). This fixes test-nested-aio-poll, which expects that polling becomes effective the first time around. Signed-off-by: Kevin Wolf Message-ID: <20250311141912.135657-1-kwolf@redhat.com> Signed-off-by: Kevin Wolf --- util/aio-posix.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/util/aio-posix.c b/util/aio-posix.c index 80785c29d2..2e0a5dadc4 100644 --- a/util/aio-posix.c +++ b/util/aio-posix.c @@ -28,6 +28,9 @@ /* Stop userspace polling on a handler if it isn't active for some time */ #define POLL_IDLE_INTERVAL_NS (7 * NANOSECONDS_PER_SECOND) +static void adjust_polling_time(AioContext *ctx, AioPolledEvent *poll, + int64_t block_ns); + bool aio_poll_disabled(AioContext *ctx) { return qatomic_read(&ctx->poll_disable_cnt); @@ -392,7 +395,8 @@ static bool aio_dispatch_handler(AioContext *ctx, AioHandler *node) * scanning all handlers with aio_dispatch_handlers(). */ static bool aio_dispatch_ready_handlers(AioContext *ctx, - AioHandlerList *ready_list) + AioHandlerList *ready_list, + int64_t block_ns) { bool progress = false; AioHandler *node; @@ -400,6 +404,14 @@ static bool aio_dispatch_ready_handlers(AioContext *ctx, while ((node = QLIST_FIRST(ready_list))) { QLIST_REMOVE(node, node_ready); progress = aio_dispatch_handler(ctx, node) || progress; + + /* + * Adjust polling time only after aio_dispatch_handler(), which can + * add the handler to ctx->poll_aio_handlers. + */ + if (ctx->poll_max_ns && QLIST_IS_INSERTED(node, node_poll)) { + adjust_polling_time(ctx, &node->poll, block_ns); + } } return progress; @@ -653,6 +665,7 @@ bool aio_poll(AioContext *ctx, bool blocking) bool use_notify_me; int64_t timeout; int64_t start = 0; + int64_t block_ns = 0; /* * There cannot be two concurrent aio_poll calls for the same AioContext (or @@ -725,20 +738,13 @@ bool aio_poll(AioContext *ctx, bool blocking) aio_notify_accept(ctx); - /* Adjust polling time */ + /* Calculate blocked time for adaptive polling */ if (ctx->poll_max_ns) { - AioHandler *node; - int64_t block_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start; - - QLIST_FOREACH(node, &ctx->poll_aio_handlers, node_poll) { - if (QLIST_IS_INSERTED(node, node_ready)) { - adjust_polling_time(ctx, &node->poll, block_ns); - } - } + block_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start; } progress |= aio_bh_poll(ctx); - progress |= aio_dispatch_ready_handlers(ctx, &ready_list); + progress |= aio_dispatch_ready_handlers(ctx, &ready_list, block_ns); aio_free_deleted_handlers(ctx); From patchwork Tue Mar 11 16:00:08 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012248 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 41FCBC28B2F for ; Tue, 11 Mar 2025 16:10:17 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts26v-0000jS-4M; Tue, 11 Mar 2025 12:05:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22G-00046r-KM for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:54 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22D-0005jf-U9 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:52 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708849; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=z7aF7WG2siRPPE1Mevf105k+h5zQqWwSgkv9iiPVJ6E=; b=UzVNMOJ4RV6RKYJsAyMMB47BtMrlvhHI0UPJNc7q2L8J5+CgX2UgdoqUiyQoWwc78BbvKM YC36J3xK954ngHEzdXjbvkhZQwNiDgEWojSTG6HqSXiZbFw+OjQhNjT5vsHeDyOhyg5Psd jK/adf9Y/i77bJ8zV1vLuHkoU5JwnhQ= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-459-az6_2gRWOdqiPyqgexgP3w-1; Tue, 11 Mar 2025 12:00:45 -0400 X-MC-Unique: az6_2gRWOdqiPyqgexgP3w-1 X-Mimecast-MFC-AGG-ID: az6_2gRWOdqiPyqgexgP3w_1741708844 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B53691954B39; Tue, 11 Mar 2025 16:00:44 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 2B690180AF7C; Tue, 11 Mar 2025 16:00:42 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 09/22] iotests: Limit qsd-migrate to working formats Date: Tue, 11 Mar 2025 17:00:08 +0100 Message-ID: <20250311160021.349761-10-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Thomas Huth qsd-migrate is currently only working for raw, qcow2 and qed. Other formats are failing, e.g. because they don't support migration. Thus let's limit this test to the three usable formats now. Suggested-by: Kevin Wolf Signed-off-by: Thomas Huth Message-ID: <20250224214058.205889-1-thuth@redhat.com> Reviewed-by: Kevin Wolf Signed-off-by: Kevin Wolf --- tests/qemu-iotests/tests/qsd-migrate | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qemu-iotests/tests/qsd-migrate b/tests/qemu-iotests/tests/qsd-migrate index de17562cb0..a4c6592420 100755 --- a/tests/qemu-iotests/tests/qsd-migrate +++ b/tests/qemu-iotests/tests/qsd-migrate @@ -22,7 +22,7 @@ import iotests from iotests import filter_qemu_io, filter_qtest -iotests.script_initialize(supported_fmts=['generic'], +iotests.script_initialize(supported_fmts=['qcow2', 'qed', 'raw'], supported_protocols=['file'], supported_platforms=['linux']) From patchwork Tue Mar 11 16:00:09 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012267 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 41A06C282EC for ; Tue, 11 Mar 2025 16:14:43 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2BO-0005Ad-B6; Tue, 11 Mar 2025 12:10:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22K-0004FR-2K for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:00 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22H-0005jw-8E for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708850; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=o3M34tbCgzDrec1OhUc9/W+zRbMhY9lNOXZ99ApbkyM=; b=Pjq+i47T1lnZ5TCX+E+n6Pp2Ab73uTnVbEx1ZFgzyuP1UdvzmUu80EL/zH+uTYf5JofIFP q1oQOxw9nKWd9ka478e7op+fjnSa2b5xjAkVq01XFIBluIX0CTI52Tiuu48beKziv2rI1e df/Fqr/16Cs+poC7Z/qs3UtSUrMVYXI= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-424-Uj0B-xfpOSqm19e4tkvLTA-1; Tue, 11 Mar 2025 12:00:48 -0400 X-MC-Unique: Uj0B-xfpOSqm19e4tkvLTA-1 X-Mimecast-MFC-AGG-ID: Uj0B-xfpOSqm19e4tkvLTA_1741708846 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id D36751954B33; Tue, 11 Mar 2025 16:00:46 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 34BE21800373; Tue, 11 Mar 2025 16:00:44 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 10/22] scsi-disk: drop unused SCSIDiskState->bh field Date: Tue, 11 Mar 2025 17:00:09 +0100 Message-ID: <20250311160021.349761-11-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi Commit 71544d30a6f8 ("scsi: push request restart to SCSIDevice") removed the only user of SCSIDiskState->bh. Signed-off-by: Stefan Hajnoczi Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-2-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- hw/scsi/scsi-disk.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index e7f738b484..caf6c1437f 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -106,7 +106,6 @@ struct SCSIDiskState { uint64_t max_unmap_size; uint64_t max_io_size; uint32_t quirks; - QEMUBH *bh; char *version; char *serial; char *vendor; From patchwork Tue Mar 11 16:00:10 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012243 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6927EC282EC for ; Tue, 11 Mar 2025 16:07:09 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts27S-0001BY-1Z; Tue, 11 Mar 2025 12:06:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22K-0004Fi-UB for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:01 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22G-0005kD-Cv for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:00:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708851; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uCCNj0NZUm4NtjFxz2ZnD0eWZpqRsgge4WXRpitU6js=; b=NOIU3DA0hHPMLuzwePpL8FqzzGqKAtfud5VBYaI751/DNCNwSzEjusOaxokj142Gg38yzi TIJYrKt1ikRRDI0FgZ/RL3wWMrEKX/w7lhiUM+Uaf9EJYvAPE2un+aEVe6VwS7waN0GE0F gDUEUGv9P2jcayki7GaKQ4wXh72vuNU= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-138-p2PV28-eNliYa3O_Ohxksg-1; Tue, 11 Mar 2025 12:00:50 -0400 X-MC-Unique: p2PV28-eNliYa3O_Ohxksg-1 X-Mimecast-MFC-AGG-ID: p2PV28-eNliYa3O_Ohxksg_1741708849 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 040EB180AF68; Tue, 11 Mar 2025 16:00:49 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 5D86818001E9; Tue, 11 Mar 2025 16:00:47 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 11/22] dma: use current AioContext for dma_blk_io() Date: Tue, 11 Mar 2025 17:00:10 +0100 Message-ID: <20250311160021.349761-12-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi In the past a single AioContext was used for block I/O and it was fetched using blk_get_aio_context(). Nowadays the block layer supports running I/O from any AioContext and multiple AioContexts at the same time. Remove the dma_blk_io() AioContext argument and use the current AioContext instead. This makes calling the function easier and enables multiple IOThreads to use dma_blk_io() concurrently for the same block device. Signed-off-by: Stefan Hajnoczi Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-3-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- include/system/dma.h | 3 +-- hw/ide/core.c | 3 +-- hw/ide/macio.c | 3 +-- hw/scsi/scsi-disk.c | 6 ++---- system/dma-helpers.c | 8 ++++---- 5 files changed, 9 insertions(+), 14 deletions(-) diff --git a/include/system/dma.h b/include/system/dma.h index 5a49a30628..e142f7efa6 100644 --- a/include/system/dma.h +++ b/include/system/dma.h @@ -290,8 +290,7 @@ typedef BlockAIOCB *DMAIOFunc(int64_t offset, QEMUIOVector *iov, BlockCompletionFunc *cb, void *cb_opaque, void *opaque); -BlockAIOCB *dma_blk_io(AioContext *ctx, - QEMUSGList *sg, uint64_t offset, uint32_t align, +BlockAIOCB *dma_blk_io(QEMUSGList *sg, uint64_t offset, uint32_t align, DMAIOFunc *io_func, void *io_func_opaque, BlockCompletionFunc *cb, void *opaque, DMADirection dir); BlockAIOCB *dma_blk_read(BlockBackend *blk, diff --git a/hw/ide/core.c b/hw/ide/core.c index f9baba59e9..b14983ec54 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -968,8 +968,7 @@ static void ide_dma_cb(void *opaque, int ret) BDRV_SECTOR_SIZE, ide_dma_cb, s); break; case IDE_DMA_TRIM: - s->bus->dma->aiocb = dma_blk_io(blk_get_aio_context(s->blk), - &s->sg, offset, BDRV_SECTOR_SIZE, + s->bus->dma->aiocb = dma_blk_io(&s->sg, offset, BDRV_SECTOR_SIZE, ide_issue_trim, s, ide_dma_cb, s, DMA_DIRECTION_TO_DEVICE); break; diff --git a/hw/ide/macio.c b/hw/ide/macio.c index 5fe764b49b..c8e8e44cc9 100644 --- a/hw/ide/macio.c +++ b/hw/ide/macio.c @@ -187,8 +187,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret) pmac_ide_transfer_cb, io); break; case IDE_DMA_TRIM: - s->bus->dma->aiocb = dma_blk_io(blk_get_aio_context(s->blk), &s->sg, - offset, 0x1, ide_issue_trim, s, + s->bus->dma->aiocb = dma_blk_io(&s->sg, offset, 0x1, ide_issue_trim, s, pmac_ide_transfer_cb, io, DMA_DIRECTION_TO_DEVICE); break; diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index caf6c1437f..f049a20275 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -487,8 +487,7 @@ static void scsi_do_read(SCSIDiskReq *r, int ret) if (r->req.sg) { dma_acct_start(s->qdev.conf.blk, &r->acct, r->req.sg, BLOCK_ACCT_READ); r->req.residual -= r->req.sg->size; - r->req.aiocb = dma_blk_io(blk_get_aio_context(s->qdev.conf.blk), - r->req.sg, r->sector << BDRV_SECTOR_BITS, + r->req.aiocb = dma_blk_io(r->req.sg, r->sector << BDRV_SECTOR_BITS, BDRV_SECTOR_SIZE, sdc->dma_readv, r, scsi_dma_complete, r, DMA_DIRECTION_FROM_DEVICE); @@ -650,8 +649,7 @@ static void scsi_write_data(SCSIRequest *req) if (r->req.sg) { dma_acct_start(s->qdev.conf.blk, &r->acct, r->req.sg, BLOCK_ACCT_WRITE); r->req.residual -= r->req.sg->size; - r->req.aiocb = dma_blk_io(blk_get_aio_context(s->qdev.conf.blk), - r->req.sg, r->sector << BDRV_SECTOR_BITS, + r->req.aiocb = dma_blk_io(r->req.sg, r->sector << BDRV_SECTOR_BITS, BDRV_SECTOR_SIZE, sdc->dma_writev, r, scsi_dma_complete, r, DMA_DIRECTION_TO_DEVICE); diff --git a/system/dma-helpers.c b/system/dma-helpers.c index f6403242f5..6bad75876f 100644 --- a/system/dma-helpers.c +++ b/system/dma-helpers.c @@ -211,7 +211,7 @@ static const AIOCBInfo dma_aiocb_info = { .cancel_async = dma_aio_cancel, }; -BlockAIOCB *dma_blk_io(AioContext *ctx, +BlockAIOCB *dma_blk_io( QEMUSGList *sg, uint64_t offset, uint32_t align, DMAIOFunc *io_func, void *io_func_opaque, BlockCompletionFunc *cb, @@ -223,7 +223,7 @@ BlockAIOCB *dma_blk_io(AioContext *ctx, dbs->acb = NULL; dbs->sg = sg; - dbs->ctx = ctx; + dbs->ctx = qemu_get_current_aio_context(); dbs->offset = offset; dbs->align = align; dbs->sg_cur_index = 0; @@ -251,7 +251,7 @@ BlockAIOCB *dma_blk_read(BlockBackend *blk, QEMUSGList *sg, uint64_t offset, uint32_t align, void (*cb)(void *opaque, int ret), void *opaque) { - return dma_blk_io(blk_get_aio_context(blk), sg, offset, align, + return dma_blk_io(sg, offset, align, dma_blk_read_io_func, blk, cb, opaque, DMA_DIRECTION_FROM_DEVICE); } @@ -269,7 +269,7 @@ BlockAIOCB *dma_blk_write(BlockBackend *blk, QEMUSGList *sg, uint64_t offset, uint32_t align, void (*cb)(void *opaque, int ret), void *opaque) { - return dma_blk_io(blk_get_aio_context(blk), sg, offset, align, + return dma_blk_io(sg, offset, align, dma_blk_write_io_func, blk, cb, opaque, DMA_DIRECTION_TO_DEVICE); } From patchwork Tue Mar 11 16:00:11 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012247 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B8AC2C282EC for ; Tue, 11 Mar 2025 16:10:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2AM-0003s1-4p; Tue, 11 Mar 2025 12:09:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22P-0004HA-Tq for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:04 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22K-0005ke-19 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708853; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rskunDLsR4f93hBQb/w9k2yQjqC7TOYCWk+W/zKfZaQ=; b=MJ5TM4Jdj8JZMrGAP2TAi1gIDFx5+H508ER6coeX0YcexQ8bMi3tRuhvJpOq9C4JGmym0c D6xNo5yiPXIdKxzcVawrXEPVnsug6DzXVC74n7X84wZfKeoZMo7uq6Gk5GEFHdPHmadmdd hh5NHcb7KiTk/Ff6rmNTP5QnWq2sn3g= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-52-peeSZBZRM-mMgwAGMI7akw-1; Tue, 11 Mar 2025 12:00:51 -0400 X-MC-Unique: peeSZBZRM-mMgwAGMI7akw-1 X-Mimecast-MFC-AGG-ID: peeSZBZRM-mMgwAGMI7akw_1741708851 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 0F5E8195609E; Tue, 11 Mar 2025 16:00:51 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 7AF9C1800373; Tue, 11 Mar 2025 16:00:49 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 12/22] scsi: track per-SCSIRequest AioContext Date: Tue, 11 Mar 2025 17:00:11 +0100 Message-ID: <20250311160021.349761-13-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi Until now, a SCSIDevice's I/O requests have run in a single AioContext. In order to support multiple IOThreads it will be necessary to move to the concept of a per-SCSIRequest AioContext. Signed-off-by: Stefan Hajnoczi Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-4-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- include/hw/scsi/scsi.h | 1 + hw/scsi/scsi-bus.c | 1 + hw/scsi/scsi-disk.c | 17 ++++++----------- 3 files changed, 8 insertions(+), 11 deletions(-) diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h index c3d5e17e38..ffc48203f9 100644 --- a/include/hw/scsi/scsi.h +++ b/include/hw/scsi/scsi.h @@ -24,6 +24,7 @@ struct SCSIRequest { SCSIBus *bus; SCSIDevice *dev; const SCSIReqOps *ops; + AioContext *ctx; uint32_t refcount; uint32_t tag; uint32_t lun; diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c index 7d4546800f..846bbbf0ec 100644 --- a/hw/scsi/scsi-bus.c +++ b/hw/scsi/scsi-bus.c @@ -868,6 +868,7 @@ invalid_opcode: } } + req->ctx = qemu_get_current_aio_context(); req->cmd = cmd; req->residual = req->cmd.xfer; diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index f049a20275..7cf8c31b98 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -328,9 +328,8 @@ static void scsi_aio_complete(void *opaque, int ret) SCSIDiskReq *r = (SCSIDiskReq *)opaque; SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev); - /* The request must only run in the BlockBackend's AioContext */ - assert(blk_get_aio_context(s->qdev.conf.blk) == - qemu_get_current_aio_context()); + /* The request must run in its AioContext */ + assert(r->req.ctx == qemu_get_current_aio_context()); assert(r->req.aiocb != NULL); r->req.aiocb = NULL; @@ -430,12 +429,10 @@ static void scsi_dma_complete(void *opaque, int ret) static void scsi_read_complete_noio(SCSIDiskReq *r, int ret) { - SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev); uint32_t n; - /* The request must only run in the BlockBackend's AioContext */ - assert(blk_get_aio_context(s->qdev.conf.blk) == - qemu_get_current_aio_context()); + /* The request must run in its AioContext */ + assert(r->req.ctx == qemu_get_current_aio_context()); assert(r->req.aiocb == NULL); if (scsi_disk_req_check_error(r, ret, ret > 0)) { @@ -562,12 +559,10 @@ static void scsi_read_data(SCSIRequest *req) static void scsi_write_complete_noio(SCSIDiskReq *r, int ret) { - SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev); uint32_t n; - /* The request must only run in the BlockBackend's AioContext */ - assert(blk_get_aio_context(s->qdev.conf.blk) == - qemu_get_current_aio_context()); + /* The request must run in its AioContext */ + assert(r->req.ctx == qemu_get_current_aio_context()); assert (r->req.aiocb == NULL); if (scsi_disk_req_check_error(r, ret, ret > 0)) { From patchwork Tue Mar 11 16:00:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012251 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 59DFDC282EC for ; Tue, 11 Mar 2025 16:10:22 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts29H-0003AE-PC; Tue, 11 Mar 2025 12:08:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22P-0004G5-3S for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:04 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22L-0005l7-L0 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:00 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708855; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=a3kypTgkGT2A6onVmLd8E/BwwRQqe5qOoW8CUGpL3RA=; b=ABAviioYxqqsS6KZ8VaIHWtKXThYxRWWZZyTciQLq5XbU0ftW5/hFqz/ISpdgmzMPFw2z7 ++WQaI2BoBluQyfN3UFytscX1qBdDsYon7hEKBN/tKiPzXBfD+KDH3YR7GaJy1u2cuCpOU +lKKX5xXM9E8rUr1qj0rLA2RxPancsc= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-84-etyCIIT3MeCeuz9xNGWb6w-1; Tue, 11 Mar 2025 12:00:54 -0400 X-MC-Unique: etyCIIT3MeCeuz9xNGWb6w-1 X-Mimecast-MFC-AGG-ID: etyCIIT3MeCeuz9xNGWb6w_1741708853 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 14D37180AF57; Tue, 11 Mar 2025 16:00:53 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 8A0471800373; Tue, 11 Mar 2025 16:00:51 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 13/22] scsi: introduce requests_lock Date: Tue, 11 Mar 2025 17:00:12 +0100 Message-ID: <20250311160021.349761-14-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi SCSIDevice keeps track of in-flight requests for device reset and Task Management Functions (TMFs). The request list requires protection so that multi-threaded SCSI emulation can be implemented in commits that follow. Signed-off-by: Stefan Hajnoczi Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-5-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- include/hw/scsi/scsi.h | 7 ++- hw/scsi/scsi-bus.c | 120 +++++++++++++++++++++++++++++------------ 2 files changed, 88 insertions(+), 39 deletions(-) diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h index ffc48203f9..90ee192b4d 100644 --- a/include/hw/scsi/scsi.h +++ b/include/hw/scsi/scsi.h @@ -49,6 +49,8 @@ struct SCSIRequest { bool dma_started; BlockAIOCB *aiocb; QEMUSGList *sg; + + /* Protected by SCSIDevice->requests_lock */ QTAILQ_ENTRY(SCSIRequest) next; }; @@ -77,10 +79,7 @@ struct SCSIDevice uint8_t sense[SCSI_SENSE_BUF_SIZE]; uint32_t sense_len; - /* - * The requests list is only accessed from the AioContext that executes - * requests or from the main loop when IOThread processing is stopped. - */ + QemuMutex requests_lock; /* protects the requests list */ QTAILQ_HEAD(, SCSIRequest) requests; uint32_t channel; diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c index 846bbbf0ec..ece1107ee8 100644 --- a/hw/scsi/scsi-bus.c +++ b/hw/scsi/scsi-bus.c @@ -100,8 +100,15 @@ static void scsi_device_for_each_req_sync(SCSIDevice *s, assert(!runstate_is_running()); assert(qemu_in_main_thread()); - QTAILQ_FOREACH_SAFE(req, &s->requests, next, next_req) { - fn(req, opaque); + /* + * Locking is not necessary because the guest is stopped and no other + * threads can be accessing the requests list, but take the lock for + * consistency. + */ + WITH_QEMU_LOCK_GUARD(&s->requests_lock) { + QTAILQ_FOREACH_SAFE(req, &s->requests, next, next_req) { + fn(req, opaque); + } } } @@ -115,21 +122,29 @@ static void scsi_device_for_each_req_async_bh(void *opaque) { g_autofree SCSIDeviceForEachReqAsyncData *data = opaque; SCSIDevice *s = data->s; - AioContext *ctx; - SCSIRequest *req; - SCSIRequest *next; + g_autoptr(GList) reqs = NULL; /* - * The BB cannot have changed contexts between this BH being scheduled and - * now: BBs' AioContexts, when they have a node attached, can only be - * changed via bdrv_try_change_aio_context(), in a drained section. While - * we have the in-flight counter incremented, that drain must block. + * Build a list of requests in this AioContext so fn() can be invoked later + * outside requests_lock. */ - ctx = blk_get_aio_context(s->conf.blk); - assert(ctx == qemu_get_current_aio_context()); + WITH_QEMU_LOCK_GUARD(&s->requests_lock) { + AioContext *ctx = qemu_get_current_aio_context(); + SCSIRequest *req; + SCSIRequest *next; + + QTAILQ_FOREACH_SAFE(req, &s->requests, next, next) { + if (req->ctx == ctx) { + scsi_req_ref(req); /* dropped after calling fn() */ + reqs = g_list_prepend(reqs, req); + } + } + } - QTAILQ_FOREACH_SAFE(req, &s->requests, next, next) { - data->fn(req, data->fn_opaque); + /* Call fn() on each request */ + for (GList *elem = g_list_first(reqs); elem; elem = g_list_next(elem)) { + data->fn(elem->data, data->fn_opaque); + scsi_req_unref(elem->data); } /* Drop the reference taken by scsi_device_for_each_req_async() */ @@ -139,9 +154,35 @@ static void scsi_device_for_each_req_async_bh(void *opaque) blk_dec_in_flight(s->conf.blk); } +static void scsi_device_for_each_req_async_do_ctx(gpointer key, gpointer value, + gpointer user_data) +{ + AioContext *ctx = key; + SCSIDeviceForEachReqAsyncData *params = user_data; + SCSIDeviceForEachReqAsyncData *data; + + data = g_new(SCSIDeviceForEachReqAsyncData, 1); + data->s = params->s; + data->fn = params->fn; + data->fn_opaque = params->fn_opaque; + + /* + * Hold a reference to the SCSIDevice until + * scsi_device_for_each_req_async_bh() finishes. + */ + object_ref(OBJECT(data->s)); + + /* Paired with scsi_device_for_each_req_async_bh() */ + blk_inc_in_flight(data->s->conf.blk); + + aio_bh_schedule_oneshot(ctx, scsi_device_for_each_req_async_bh, data); +} + /* * Schedule @fn() to be invoked for each enqueued request in device @s. @fn() - * runs in the AioContext that is executing the request. + * must be thread-safe because it runs concurrently in each AioContext that is + * executing a request. + * * Keeps the BlockBackend's in-flight counter incremented until everything is * done, so draining it will settle all scheduled @fn() calls. */ @@ -151,24 +192,26 @@ static void scsi_device_for_each_req_async(SCSIDevice *s, { assert(qemu_in_main_thread()); - SCSIDeviceForEachReqAsyncData *data = - g_new(SCSIDeviceForEachReqAsyncData, 1); - - data->s = s; - data->fn = fn; - data->fn_opaque = opaque; - - /* - * Hold a reference to the SCSIDevice until - * scsi_device_for_each_req_async_bh() finishes. - */ - object_ref(OBJECT(s)); + /* The set of AioContexts where the requests are being processed */ + g_autoptr(GHashTable) aio_contexts = g_hash_table_new(NULL, NULL); + WITH_QEMU_LOCK_GUARD(&s->requests_lock) { + SCSIRequest *req; + QTAILQ_FOREACH(req, &s->requests, next) { + g_hash_table_add(aio_contexts, req->ctx); + } + } - /* Paired with blk_dec_in_flight() in scsi_device_for_each_req_async_bh() */ - blk_inc_in_flight(s->conf.blk); - aio_bh_schedule_oneshot(blk_get_aio_context(s->conf.blk), - scsi_device_for_each_req_async_bh, - data); + /* Schedule a BH for each AioContext */ + SCSIDeviceForEachReqAsyncData params = { + .s = s, + .fn = fn, + .fn_opaque = opaque, + }; + g_hash_table_foreach( + aio_contexts, + scsi_device_for_each_req_async_do_ctx, + ¶ms + ); } static void scsi_device_realize(SCSIDevice *s, Error **errp) @@ -349,6 +392,7 @@ static void scsi_qdev_realize(DeviceState *qdev, Error **errp) dev->lun = lun; } + qemu_mutex_init(&dev->requests_lock); QTAILQ_INIT(&dev->requests); scsi_device_realize(dev, &local_err); if (local_err) { @@ -369,6 +413,8 @@ static void scsi_qdev_unrealize(DeviceState *qdev) scsi_device_purge_requests(dev, SENSE_CODE(NO_SENSE)); + qemu_mutex_destroy(&dev->requests_lock); + scsi_device_unrealize(dev); blockdev_mark_auto_del(dev->conf.blk); @@ -965,7 +1011,10 @@ static void scsi_req_enqueue_internal(SCSIRequest *req) req->sg = NULL; } req->enqueued = true; - QTAILQ_INSERT_TAIL(&req->dev->requests, req, next); + + WITH_QEMU_LOCK_GUARD(&req->dev->requests_lock) { + QTAILQ_INSERT_TAIL(&req->dev->requests, req, next); + } } int32_t scsi_req_enqueue(SCSIRequest *req) @@ -985,7 +1034,9 @@ static void scsi_req_dequeue(SCSIRequest *req) trace_scsi_req_dequeue(req->dev->id, req->lun, req->tag); req->retry = false; if (req->enqueued) { - QTAILQ_REMOVE(&req->dev->requests, req, next); + WITH_QEMU_LOCK_GUARD(&req->dev->requests_lock) { + QTAILQ_REMOVE(&req->dev->requests, req, next); + } req->enqueued = false; scsi_req_unref(req); } @@ -1962,8 +2013,7 @@ static void scsi_device_class_init(ObjectClass *klass, void *data) static void scsi_dev_instance_init(Object *obj) { - DeviceState *dev = DEVICE(obj); - SCSIDevice *s = SCSI_DEVICE(dev); + SCSIDevice *s = SCSI_DEVICE(obj); device_add_bootindex_property(obj, &s->conf.bootindex, "bootindex", NULL, From patchwork Tue Mar 11 16:00:13 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012266 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E958AC282EC for ; Tue, 11 Mar 2025 16:12:23 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2Cy-000832-84; Tue, 11 Mar 2025 12:11:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22V-0004PD-Pn for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:28 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22P-0005m5-JZ for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708860; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s4IZO3vwmSLcmgRxaZNwXSzmWEfgubyuOVHXMZ+T5f0=; b=RKi+q9A7QeH+pqJFAMaqBWW+vDOoerMw/7R/A5+F3lyJiU6rx105U7WwK8ArrwH3O1MkRE KGkqBbtYVpzaHA2vcMzJGwN9JGylN1KpInQXV6Z9ulPkSihJFfT8LYwRFnEU665PXy2yCi 25jXTioHtCkEujmNXZHb3NOzBFgoezg= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-647-LCLZ9DpWPHupbIx4eagXKA-1; Tue, 11 Mar 2025 12:00:56 -0400 X-MC-Unique: LCLZ9DpWPHupbIx4eagXKA-1 X-Mimecast-MFC-AGG-ID: LCLZ9DpWPHupbIx4eagXKA_1741708855 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1CDFA180882E; Tue, 11 Mar 2025 16:00:55 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 8A52B1800373; Tue, 11 Mar 2025 16:00:53 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 14/22] virtio-scsi: introduce event and ctrl virtqueue locks Date: Tue, 11 Mar 2025 17:00:13 +0100 Message-ID: <20250311160021.349761-15-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi Virtqueues are not thread-safe. Until now this was not a major issue since all virtqueue processing happened in the same thread. The ctrl queue's Task Management Function (TMF) requests sometimes need the main loop, so a BH was used to schedule the virtqueue completion back in the thread that has virtqueue access. When IOThread Virtqueue Mapping is introduced in later commits, event and ctrl virtqueue accesses from other threads will become necessary. Introduce an optional per-virtqueue lock so the event and ctrl virtqueues can be protected in the commits that follow. The addition of the ctrl virtqueue lock makes virtio_scsi_complete_req_from_main_loop() and its BH unnecessary. Instead, take the ctrl virtqueue lock from the main loop thread. The cmd virtqueue does not have a lock because the entirety of SCSI command processing happens in one thread. Only one thread accesses the cmd virtqueue and a lock is unnecessary. Signed-off-by: Stefan Hajnoczi Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-6-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- include/hw/virtio/virtio-scsi.h | 3 ++ hw/scsi/virtio-scsi.c | 84 ++++++++++++++++++--------------- 2 files changed, 49 insertions(+), 38 deletions(-) diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h index be230cd4bf..4ee98ebf63 100644 --- a/include/hw/virtio/virtio-scsi.h +++ b/include/hw/virtio/virtio-scsi.h @@ -84,6 +84,9 @@ struct VirtIOSCSI { int resetting; /* written from main loop thread, read from any thread */ bool events_dropped; + QemuMutex ctrl_lock; /* protects ctrl_vq */ + QemuMutex event_lock; /* protects event_vq */ + /* * TMFs deferred to main loop BH. These fields are protected by * tmf_bh_lock. diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 7d094e1881..073ccd3d5b 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -102,13 +102,18 @@ static void virtio_scsi_free_req(VirtIOSCSIReq *req) g_free(req); } -static void virtio_scsi_complete_req(VirtIOSCSIReq *req) +static void virtio_scsi_complete_req(VirtIOSCSIReq *req, QemuMutex *vq_lock) { VirtIOSCSI *s = req->dev; VirtQueue *vq = req->vq; VirtIODevice *vdev = VIRTIO_DEVICE(s); qemu_iovec_from_buf(&req->resp_iov, 0, &req->resp, req->resp_size); + + if (vq_lock) { + qemu_mutex_lock(vq_lock); + } + virtqueue_push(vq, &req->elem, req->qsgl.size + req->resp_iov.size); if (s->dataplane_started && !s->dataplane_fenced) { virtio_notify_irqfd(vdev, vq); @@ -116,6 +121,10 @@ static void virtio_scsi_complete_req(VirtIOSCSIReq *req) virtio_notify(vdev, vq); } + if (vq_lock) { + qemu_mutex_unlock(vq_lock); + } + if (req->sreq) { req->sreq->hba_private = NULL; scsi_req_unref(req->sreq); @@ -123,34 +132,20 @@ static void virtio_scsi_complete_req(VirtIOSCSIReq *req) virtio_scsi_free_req(req); } -static void virtio_scsi_complete_req_bh(void *opaque) +static void virtio_scsi_bad_req(VirtIOSCSIReq *req, QemuMutex *vq_lock) { - VirtIOSCSIReq *req = opaque; + virtio_error(VIRTIO_DEVICE(req->dev), "wrong size for virtio-scsi headers"); - virtio_scsi_complete_req(req); -} + if (vq_lock) { + qemu_mutex_lock(vq_lock); + } -/* - * Called from virtio_scsi_do_one_tmf_bh() in main loop thread. The main loop - * thread cannot touch the virtqueue since that could race with an IOThread. - */ -static void virtio_scsi_complete_req_from_main_loop(VirtIOSCSIReq *req) -{ - VirtIOSCSI *s = req->dev; + virtqueue_detach_element(req->vq, &req->elem, 0); - if (!s->ctx || s->ctx == qemu_get_aio_context()) { - /* No need to schedule a BH when there is no IOThread */ - virtio_scsi_complete_req(req); - } else { - /* Run request completion in the IOThread */ - aio_wait_bh_oneshot(s->ctx, virtio_scsi_complete_req_bh, req); + if (vq_lock) { + qemu_mutex_unlock(vq_lock); } -} -static void virtio_scsi_bad_req(VirtIOSCSIReq *req) -{ - virtio_error(VIRTIO_DEVICE(req->dev), "wrong size for virtio-scsi headers"); - virtqueue_detach_element(req->vq, &req->elem, 0); virtio_scsi_free_req(req); } @@ -235,12 +230,21 @@ static int virtio_scsi_parse_req(VirtIOSCSIReq *req, return 0; } -static VirtIOSCSIReq *virtio_scsi_pop_req(VirtIOSCSI *s, VirtQueue *vq) +static VirtIOSCSIReq *virtio_scsi_pop_req(VirtIOSCSI *s, VirtQueue *vq, QemuMutex *vq_lock) { VirtIOSCSICommon *vs = (VirtIOSCSICommon *)s; VirtIOSCSIReq *req; + if (vq_lock) { + qemu_mutex_lock(vq_lock); + } + req = virtqueue_pop(vq, sizeof(VirtIOSCSIReq) + vs->cdb_size); + + if (vq_lock) { + qemu_mutex_unlock(vq_lock); + } + if (!req) { return NULL; } @@ -305,7 +309,7 @@ static void virtio_scsi_cancel_notify(Notifier *notifier, void *data) trace_virtio_scsi_tmf_resp(virtio_scsi_get_lun(req->req.tmf.lun), req->req.tmf.tag, req->resp.tmf.response); - virtio_scsi_complete_req(req); + virtio_scsi_complete_req(req, &req->dev->ctrl_lock); } g_free(n); } @@ -361,7 +365,7 @@ static void virtio_scsi_do_one_tmf_bh(VirtIOSCSIReq *req) out: object_unref(OBJECT(d)); - virtio_scsi_complete_req_from_main_loop(req); + virtio_scsi_complete_req(req, &s->ctrl_lock); } /* Some TMFs must be processed from the main loop thread */ @@ -408,7 +412,7 @@ static void virtio_scsi_reset_tmf_bh(VirtIOSCSI *s) /* SAM-6 6.3.2 Hard reset */ req->resp.tmf.response = VIRTIO_SCSI_S_TARGET_FAILURE; - virtio_scsi_complete_req(req); + virtio_scsi_complete_req(req, &req->dev->ctrl_lock); } } @@ -562,7 +566,7 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req) if (iov_to_buf(req->elem.out_sg, req->elem.out_num, 0, &type, sizeof(type)) < sizeof(type)) { - virtio_scsi_bad_req(req); + virtio_scsi_bad_req(req, &s->ctrl_lock); return; } @@ -570,7 +574,7 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req) if (type == VIRTIO_SCSI_T_TMF) { if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlTMFReq), sizeof(VirtIOSCSICtrlTMFResp)) < 0) { - virtio_scsi_bad_req(req); + virtio_scsi_bad_req(req, &s->ctrl_lock); return; } else { r = virtio_scsi_do_tmf(s, req); @@ -580,7 +584,7 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req) type == VIRTIO_SCSI_T_AN_SUBSCRIBE) { if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlANReq), sizeof(VirtIOSCSICtrlANResp)) < 0) { - virtio_scsi_bad_req(req); + virtio_scsi_bad_req(req, &s->ctrl_lock); return; } else { req->req.an.event_requested = @@ -600,7 +604,7 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req) type == VIRTIO_SCSI_T_AN_SUBSCRIBE) trace_virtio_scsi_an_resp(virtio_scsi_get_lun(req->req.an.lun), req->resp.an.response); - virtio_scsi_complete_req(req); + virtio_scsi_complete_req(req, &s->ctrl_lock); } else { assert(r == -EINPROGRESS); } @@ -610,7 +614,7 @@ static void virtio_scsi_handle_ctrl_vq(VirtIOSCSI *s, VirtQueue *vq) { VirtIOSCSIReq *req; - while ((req = virtio_scsi_pop_req(s, vq))) { + while ((req = virtio_scsi_pop_req(s, vq, &s->ctrl_lock))) { virtio_scsi_handle_ctrl_req(s, req); } } @@ -654,7 +658,7 @@ static void virtio_scsi_complete_cmd_req(VirtIOSCSIReq *req) * in virtio_scsi_command_complete. */ req->resp_size = sizeof(VirtIOSCSICmdResp); - virtio_scsi_complete_req(req); + virtio_scsi_complete_req(req, NULL); } static void virtio_scsi_command_failed(SCSIRequest *r) @@ -788,7 +792,7 @@ static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req) virtio_scsi_fail_cmd_req(req); return -ENOTSUP; } else { - virtio_scsi_bad_req(req); + virtio_scsi_bad_req(req, NULL); return -EINVAL; } } @@ -843,7 +847,7 @@ static void virtio_scsi_handle_cmd_vq(VirtIOSCSI *s, VirtQueue *vq) virtio_queue_set_notification(vq, 0); } - while ((req = virtio_scsi_pop_req(s, vq))) { + while ((req = virtio_scsi_pop_req(s, vq, NULL))) { ret = virtio_scsi_handle_cmd_req_prepare(s, req); if (!ret) { QTAILQ_INSERT_TAIL(&reqs, req, next); @@ -973,7 +977,7 @@ static void virtio_scsi_push_event(VirtIOSCSI *s, return; } - req = virtio_scsi_pop_req(s, vs->event_vq); + req = virtio_scsi_pop_req(s, vs->event_vq, &s->event_lock); if (!req) { s->events_dropped = true; return; @@ -985,7 +989,7 @@ static void virtio_scsi_push_event(VirtIOSCSI *s, } if (virtio_scsi_parse_req(req, 0, sizeof(VirtIOSCSIEvent))) { - virtio_scsi_bad_req(req); + virtio_scsi_bad_req(req, &s->event_lock); return; } @@ -1005,7 +1009,7 @@ static void virtio_scsi_push_event(VirtIOSCSI *s, } trace_virtio_scsi_event(virtio_scsi_get_lun(evt->lun), event, reason); - virtio_scsi_complete_req(req); + virtio_scsi_complete_req(req, &s->event_lock); } static void virtio_scsi_handle_event_vq(VirtIOSCSI *s, VirtQueue *vq) @@ -1236,6 +1240,8 @@ static void virtio_scsi_device_realize(DeviceState *dev, Error **errp) Error *err = NULL; QTAILQ_INIT(&s->tmf_bh_list); + qemu_mutex_init(&s->ctrl_lock); + qemu_mutex_init(&s->event_lock); qemu_mutex_init(&s->tmf_bh_lock); virtio_scsi_common_realize(dev, @@ -1280,6 +1286,8 @@ static void virtio_scsi_device_unrealize(DeviceState *dev) qbus_set_hotplug_handler(BUS(&s->bus), NULL); virtio_scsi_common_unrealize(dev); qemu_mutex_destroy(&s->tmf_bh_lock); + qemu_mutex_destroy(&s->event_lock); + qemu_mutex_destroy(&s->ctrl_lock); } static const Property virtio_scsi_properties[] = { From patchwork Tue Mar 11 16:00:14 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012256 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 264A4C282EC for ; Tue, 11 Mar 2025 16:10:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2AC-0003lX-Rw; Tue, 11 Mar 2025 12:09:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22b-0004Pf-PI for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:28 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22S-0005mS-2a for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:07 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708861; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1SvDv+oKK8AEvt8aEvHnDL9caSTSbyYy2fmRvxxoh7g=; b=OSpElQdQWg5m6HWUUYxirQu2gz/nVbOjwzMMNlwbAzrFU/h+rK7TKzIzC5p1PpLEQ6k8jp 3s/9A+XK4/0SVK412FTtRdbBAGdR8F8uXeChtrNneimkBbwBaKH41JES+YEQedlYfBi7gN P/Eyaomk1ncpJPxPLKOF6MhqS1eZFkE= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-530-XX45UFF2Ouyl_ajfe8s6WA-1; Tue, 11 Mar 2025 12:00:58 -0400 X-MC-Unique: XX45UFF2Ouyl_ajfe8s6WA-1 X-Mimecast-MFC-AGG-ID: XX45UFF2Ouyl_ajfe8s6WA_1741708857 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1FB14180AF5E; Tue, 11 Mar 2025 16:00:57 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 920FA180AF7C; Tue, 11 Mar 2025 16:00:55 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 15/22] virtio-scsi: protect events_dropped field Date: Tue, 11 Mar 2025 17:00:14 +0100 Message-ID: <20250311160021.349761-16-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi The block layer can invoke the resize callback from any AioContext that is processing requests. The virtqueue is already protected but the events_dropped field also needs to be protected against races. Cover it using the event virtqueue lock because it is closely associated with accesses to the virtqueue. Signed-off-by: Stefan Hajnoczi Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-7-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- include/hw/virtio/virtio-scsi.h | 3 ++- hw/scsi/virtio-scsi.c | 29 ++++++++++++++++++++--------- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h index 4ee98ebf63..7b7e3ced7a 100644 --- a/include/hw/virtio/virtio-scsi.h +++ b/include/hw/virtio/virtio-scsi.h @@ -82,10 +82,11 @@ struct VirtIOSCSI { SCSIBus bus; int resetting; /* written from main loop thread, read from any thread */ + + QemuMutex event_lock; /* protects event_vq and events_dropped */ bool events_dropped; QemuMutex ctrl_lock; /* protects ctrl_vq */ - QemuMutex event_lock; /* protects event_vq */ /* * TMFs deferred to main loop BH. These fields are protected by diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 073ccd3d5b..2d796a861b 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -948,7 +948,10 @@ static void virtio_scsi_reset(VirtIODevice *vdev) vs->sense_size = VIRTIO_SCSI_SENSE_DEFAULT_SIZE; vs->cdb_size = VIRTIO_SCSI_CDB_DEFAULT_SIZE; - s->events_dropped = false; + + WITH_QEMU_LOCK_GUARD(&s->event_lock) { + s->events_dropped = false; + } } typedef struct { @@ -978,14 +981,16 @@ static void virtio_scsi_push_event(VirtIOSCSI *s, } req = virtio_scsi_pop_req(s, vs->event_vq, &s->event_lock); - if (!req) { - s->events_dropped = true; - return; - } + WITH_QEMU_LOCK_GUARD(&s->event_lock) { + if (!req) { + s->events_dropped = true; + return; + } - if (s->events_dropped) { - event |= VIRTIO_SCSI_T_EVENTS_MISSED; - s->events_dropped = false; + if (s->events_dropped) { + event |= VIRTIO_SCSI_T_EVENTS_MISSED; + s->events_dropped = false; + } } if (virtio_scsi_parse_req(req, 0, sizeof(VirtIOSCSIEvent))) { @@ -1014,7 +1019,13 @@ static void virtio_scsi_push_event(VirtIOSCSI *s, static void virtio_scsi_handle_event_vq(VirtIOSCSI *s, VirtQueue *vq) { - if (s->events_dropped) { + bool events_dropped; + + WITH_QEMU_LOCK_GUARD(&s->event_lock) { + events_dropped = s->events_dropped; + } + + if (events_dropped) { VirtIOSCSIEventInfo info = { .event = VIRTIO_SCSI_T_NO_EVENT, }; From patchwork Tue Mar 11 16:00:15 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012269 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 30F63C282EC for ; Tue, 11 Mar 2025 16:15:33 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2Bs-0006Kq-3j; Tue, 11 Mar 2025 12:10:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22g-0004WI-6u for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:30 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22V-0005mr-IV for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:13 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708863; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yJYBcyV9FB7KoeM9H7v85V0FLu9xdvUBGH+DjnkO3x4=; b=GB/c0XQSd+fQ3MkqrbjWu5QQhruBaKjDk6f7/kKywzrUIAAAAazt+QWeM+zJ6xDuvxp9xC cEtfdAXcDi1eXrs1+89j8LkgzaIOSonqNs/WOsuihoGZvRqr9+H/hHfRN2G110jBmuWKDZ oHMb/AmTJ0TuiXt/ezdmLjGNNzJYnBw= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-43-P1giimfeOz6czHuKdlHfyg-1; Tue, 11 Mar 2025 12:01:00 -0400 X-MC-Unique: P1giimfeOz6czHuKdlHfyg-1 X-Mimecast-MFC-AGG-ID: P1giimfeOz6czHuKdlHfyg_1741708859 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4836C19560B7; Tue, 11 Mar 2025 16:00:59 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 953051800373; Tue, 11 Mar 2025 16:00:57 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 16/22] virtio-scsi: perform TMFs in appropriate AioContexts Date: Tue, 11 Mar 2025 17:00:15 +0100 Message-ID: <20250311160021.349761-17-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi With IOThread Virtqueue Mapping there will be multiple AioContexts processing SCSI requests. scsi_req_cancel() and other SCSI request operations must be performed from the AioContext where the request is running. Introduce a virtio_scsi_defer_tmf_to_aio_context() function and the necessary VirtIOSCSIReq->remaining refcount infrastructure to move the TMF code into the AioContext where the request is running. For the time being there is still just one AioContext: the main loop or the IOThread. When the iothread-vq-mapping parameter is added in a later patch this will be changed to per-virtqueue AioContexts. Signed-off-by: Stefan Hajnoczi Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-8-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- hw/scsi/virtio-scsi.c | 270 ++++++++++++++++++++++++++++++++---------- 1 file changed, 206 insertions(+), 64 deletions(-) diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 2d796a861b..2045d27289 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -47,7 +47,7 @@ typedef struct VirtIOSCSIReq { /* Used for two-stage request submission and TMFs deferred to BH */ QTAILQ_ENTRY(VirtIOSCSIReq) next; - /* Used for cancellation of request during TMFs */ + /* Used for cancellation of request during TMFs. Atomic. */ int remaining; SCSIRequest *sreq; @@ -298,19 +298,23 @@ typedef struct { VirtIOSCSIReq *tmf_req; } VirtIOSCSICancelNotifier; +static void virtio_scsi_tmf_dec_remaining(VirtIOSCSIReq *tmf) +{ + if (qatomic_fetch_dec(&tmf->remaining) == 1) { + trace_virtio_scsi_tmf_resp(virtio_scsi_get_lun(tmf->req.tmf.lun), + tmf->req.tmf.tag, tmf->resp.tmf.response); + + virtio_scsi_complete_req(tmf, &tmf->dev->ctrl_lock); + } +} + static void virtio_scsi_cancel_notify(Notifier *notifier, void *data) { VirtIOSCSICancelNotifier *n = container_of(notifier, VirtIOSCSICancelNotifier, notifier); - if (--n->tmf_req->remaining == 0) { - VirtIOSCSIReq *req = n->tmf_req; - - trace_virtio_scsi_tmf_resp(virtio_scsi_get_lun(req->req.tmf.lun), - req->req.tmf.tag, req->resp.tmf.response); - virtio_scsi_complete_req(req, &req->dev->ctrl_lock); - } + virtio_scsi_tmf_dec_remaining(n->tmf_req); g_free(n); } @@ -416,7 +420,7 @@ static void virtio_scsi_reset_tmf_bh(VirtIOSCSI *s) } } -static void virtio_scsi_defer_tmf_to_bh(VirtIOSCSIReq *req) +static void virtio_scsi_defer_tmf_to_main_loop(VirtIOSCSIReq *req) { VirtIOSCSI *s = req->dev; @@ -430,6 +434,137 @@ static void virtio_scsi_defer_tmf_to_bh(VirtIOSCSIReq *req) } } +static void virtio_scsi_tmf_cancel_req(VirtIOSCSIReq *tmf, SCSIRequest *r) +{ + VirtIOSCSICancelNotifier *notifier; + + assert(r->ctx == qemu_get_current_aio_context()); + + /* Decremented in virtio_scsi_cancel_notify() */ + qatomic_inc(&tmf->remaining); + + notifier = g_new(VirtIOSCSICancelNotifier, 1); + notifier->notifier.notify = virtio_scsi_cancel_notify; + notifier->tmf_req = tmf; + scsi_req_cancel_async(r, ¬ifier->notifier); +} + +/* Execute a TMF on the requests in the current AioContext */ +static void virtio_scsi_do_tmf_aio_context(void *opaque) +{ + AioContext *ctx = qemu_get_current_aio_context(); + VirtIOSCSIReq *tmf = opaque; + VirtIOSCSI *s = tmf->dev; + SCSIDevice *d = virtio_scsi_device_get(s, tmf->req.tmf.lun); + SCSIRequest *r; + bool match_tag; + + if (!d) { + tmf->resp.tmf.response = VIRTIO_SCSI_S_BAD_TARGET; + virtio_scsi_tmf_dec_remaining(tmf); + return; + } + + /* + * This function could handle other subtypes that need to be processed in + * the request's AioContext in the future, but for now only request + * cancelation subtypes are performed here. + */ + switch (tmf->req.tmf.subtype) { + case VIRTIO_SCSI_T_TMF_ABORT_TASK: + match_tag = true; + break; + case VIRTIO_SCSI_T_TMF_ABORT_TASK_SET: + case VIRTIO_SCSI_T_TMF_CLEAR_TASK_SET: + match_tag = false; + break; + default: + g_assert_not_reached(); + } + + WITH_QEMU_LOCK_GUARD(&d->requests_lock) { + QTAILQ_FOREACH(r, &d->requests, next) { + VirtIOSCSIReq *cmd_req = r->hba_private; + assert(cmd_req); /* request has hba_private while enqueued */ + + if (r->ctx != ctx) { + continue; + } + if (match_tag && cmd_req->req.cmd.tag != tmf->req.tmf.tag) { + continue; + } + virtio_scsi_tmf_cancel_req(tmf, r); + } + } + + /* Incremented by virtio_scsi_do_tmf() */ + virtio_scsi_tmf_dec_remaining(tmf); + + object_unref(d); +} + +static void dummy_bh(void *opaque) +{ + /* Do nothing */ +} + +/* + * Wait for pending virtio_scsi_defer_tmf_to_aio_context() BHs. + */ +static void virtio_scsi_flush_defer_tmf_to_aio_context(VirtIOSCSI *s) +{ + GLOBAL_STATE_CODE(); + + assert(!s->dataplane_started); + + if (s->ctx) { + /* Our BH only runs after previously scheduled BHs */ + aio_wait_bh_oneshot(s->ctx, dummy_bh, NULL); + } +} + +/* + * Run the TMF in a specific AioContext, handling only requests in that + * AioContext. This is necessary because requests can run in different + * AioContext and it is only possible to cancel them from the AioContext where + * they are running. + */ +static void virtio_scsi_defer_tmf_to_aio_context(VirtIOSCSIReq *tmf, + AioContext *ctx) +{ + /* Decremented in virtio_scsi_do_tmf_aio_context() */ + qatomic_inc(&tmf->remaining); + + /* See virtio_scsi_flush_defer_tmf_to_aio_context() cleanup during reset */ + aio_bh_schedule_oneshot(ctx, virtio_scsi_do_tmf_aio_context, tmf); +} + +/* + * Returns the AioContext for a given TMF's tag field or NULL. Note that the + * request identified by the tag may have completed by the time you can execute + * a BH in the AioContext, so don't assume the request still exists in your BH. + */ +static AioContext *find_aio_context_for_tmf_tag(SCSIDevice *d, + VirtIOSCSIReq *tmf) +{ + WITH_QEMU_LOCK_GUARD(&d->requests_lock) { + SCSIRequest *r; + SCSIRequest *next; + + QTAILQ_FOREACH_SAFE(r, &d->requests, next, next) { + VirtIOSCSIReq *cmd_req = r->hba_private; + + /* hba_private is non-NULL while the request is enqueued */ + assert(cmd_req); + + if (cmd_req->req.cmd.tag == tmf->req.tmf.tag) { + return r->ctx; + } + } + } + return NULL; +} + /* Return 0 if the request is ready to be completed and return to guest; * -EINPROGRESS if the request is submitted and will be completed later, in the * case of async cancellation. */ @@ -437,6 +572,7 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req) { SCSIDevice *d = virtio_scsi_device_get(s, req->req.tmf.lun); SCSIRequest *r, *next; + AioContext *ctx; int ret = 0; virtio_scsi_ctx_check(s, d); @@ -454,52 +590,72 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req) req->req.tmf.tag, req->req.tmf.subtype); switch (req->req.tmf.subtype) { - case VIRTIO_SCSI_T_TMF_ABORT_TASK: - case VIRTIO_SCSI_T_TMF_QUERY_TASK: + case VIRTIO_SCSI_T_TMF_ABORT_TASK: { if (!d) { goto fail; } if (d->lun != virtio_scsi_get_lun(req->req.tmf.lun)) { goto incorrect_lun; } - QTAILQ_FOREACH_SAFE(r, &d->requests, next, next) { - VirtIOSCSIReq *cmd_req = r->hba_private; - if (cmd_req && cmd_req->req.cmd.tag == req->req.tmf.tag) { - break; - } + + ctx = find_aio_context_for_tmf_tag(d, req); + if (ctx) { + virtio_scsi_defer_tmf_to_aio_context(req, ctx); + ret = -EINPROGRESS; } - if (r) { - /* - * Assert that the request has not been completed yet, we - * check for it in the loop above. - */ - assert(r->hba_private); - if (req->req.tmf.subtype == VIRTIO_SCSI_T_TMF_QUERY_TASK) { - /* "If the specified command is present in the task set, then - * return a service response set to FUNCTION SUCCEEDED". - */ - req->resp.tmf.response = VIRTIO_SCSI_S_FUNCTION_SUCCEEDED; - } else { - VirtIOSCSICancelNotifier *notifier; - - req->remaining = 1; - notifier = g_new(VirtIOSCSICancelNotifier, 1); - notifier->tmf_req = req; - notifier->notifier.notify = virtio_scsi_cancel_notify; - scsi_req_cancel_async(r, ¬ifier->notifier); - ret = -EINPROGRESS; + break; + } + + case VIRTIO_SCSI_T_TMF_QUERY_TASK: + if (!d) { + goto fail; + } + if (d->lun != virtio_scsi_get_lun(req->req.tmf.lun)) { + goto incorrect_lun; + } + + WITH_QEMU_LOCK_GUARD(&d->requests_lock) { + QTAILQ_FOREACH(r, &d->requests, next) { + VirtIOSCSIReq *cmd_req = r->hba_private; + assert(cmd_req); /* request has hba_private while enqueued */ + + if (cmd_req->req.cmd.tag == req->req.tmf.tag) { + /* + * "If the specified command is present in the task set, + * then return a service response set to FUNCTION + * SUCCEEDED". + */ + req->resp.tmf.response = VIRTIO_SCSI_S_FUNCTION_SUCCEEDED; + } } } break; case VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET: case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET: - virtio_scsi_defer_tmf_to_bh(req); + virtio_scsi_defer_tmf_to_main_loop(req); ret = -EINPROGRESS; break; case VIRTIO_SCSI_T_TMF_ABORT_TASK_SET: - case VIRTIO_SCSI_T_TMF_CLEAR_TASK_SET: + case VIRTIO_SCSI_T_TMF_CLEAR_TASK_SET: { + if (!d) { + goto fail; + } + if (d->lun != virtio_scsi_get_lun(req->req.tmf.lun)) { + goto incorrect_lun; + } + + qatomic_inc(&req->remaining); + + ctx = s->ctx ?: qemu_get_aio_context(); + virtio_scsi_defer_tmf_to_aio_context(req, ctx); + + virtio_scsi_tmf_dec_remaining(req); + ret = -EINPROGRESS; + break; + } + case VIRTIO_SCSI_T_TMF_QUERY_TASK_SET: if (!d) { goto fail; @@ -508,34 +664,19 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req) goto incorrect_lun; } - /* Add 1 to "remaining" until virtio_scsi_do_tmf returns. - * This way, if the bus starts calling back to the notifiers - * even before we finish the loop, virtio_scsi_cancel_notify - * will not complete the TMF too early. - */ - req->remaining = 1; - QTAILQ_FOREACH_SAFE(r, &d->requests, next, next) { - if (r->hba_private) { - if (req->req.tmf.subtype == VIRTIO_SCSI_T_TMF_QUERY_TASK_SET) { - /* "If there is any command present in the task set, then - * return a service response set to FUNCTION SUCCEEDED". - */ - req->resp.tmf.response = VIRTIO_SCSI_S_FUNCTION_SUCCEEDED; - break; - } else { - VirtIOSCSICancelNotifier *notifier; - - req->remaining++; - notifier = g_new(VirtIOSCSICancelNotifier, 1); - notifier->notifier.notify = virtio_scsi_cancel_notify; - notifier->tmf_req = req; - scsi_req_cancel_async(r, ¬ifier->notifier); - } + WITH_QEMU_LOCK_GUARD(&d->requests_lock) { + QTAILQ_FOREACH_SAFE(r, &d->requests, next, next) { + /* Request has hba_private while enqueued */ + assert(r->hba_private); + + /* + * "If there is any command present in the task set, then + * return a service response set to FUNCTION SUCCEEDED". + */ + req->resp.tmf.response = VIRTIO_SCSI_S_FUNCTION_SUCCEEDED; + break; } } - if (--req->remaining > 0) { - ret = -EINPROGRESS; - } break; case VIRTIO_SCSI_T_TMF_CLEAR_ACA: @@ -941,6 +1082,7 @@ static void virtio_scsi_reset(VirtIODevice *vdev) assert(!s->dataplane_started); virtio_scsi_reset_tmf_bh(s); + virtio_scsi_flush_defer_tmf_to_aio_context(s); qatomic_inc(&s->resetting); bus_cold_reset(BUS(&s->bus)); From patchwork Tue Mar 11 16:00:16 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012265 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5D2E0C282EC for ; Tue, 11 Mar 2025 16:12:20 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2Cj-0007tb-Rm; Tue, 11 Mar 2025 12:11:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts235-0004WJ-11 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:45 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22V-0005nM-I5 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708865; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=b8KjjcJ8pbVd3HbdzSZOrZTAq7GE2G+9x/rrybtDQhs=; b=hUneVTfs8MN9ZNmgh45FO6g0DdOMaRgRSxXM3GISxcvzite3d86qqnX/MuxAPnYPrcfLEN t1z+Cj5N2DayIUDsZsn6UJh35Pyw3jnLqO/hlRwXFLC7FNtZ7JFs5Y0Idzq1QG2mBxaaPZ otqoVnlMX5Bp+6Et9sD4tu8i2ibBsYc= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-612-eyZE47rcPvapHXb20DK9ww-1; Tue, 11 Mar 2025 12:01:02 -0400 X-MC-Unique: eyZE47rcPvapHXb20DK9ww-1 X-Mimecast-MFC-AGG-ID: eyZE47rcPvapHXb20DK9ww_1741708861 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4CF43195609E; Tue, 11 Mar 2025 16:01:01 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id BD67118001E9; Tue, 11 Mar 2025 16:00:59 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 17/22] virtio-blk: extract cleanup_iothread_vq_mapping() function Date: Tue, 11 Mar 2025 17:00:16 +0100 Message-ID: <20250311160021.349761-18-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi This is the cleanup function that must be called after apply_iothread_vq_mapping() succeeds. virtio-scsi will need this function too, so extract it. Signed-off-by: Stefan Hajnoczi Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-9-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- hw/block/virtio-blk.c | 27 +++++++++++++++++++++------ 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 5135b4d8f1..21b1b768ed 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1495,6 +1495,9 @@ validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list, * Fill in the AioContext for each virtqueue in the @vq_aio_context array given * the iothread-vq-mapping parameter in @iothread_vq_mapping_list. * + * cleanup_iothread_vq_mapping() must be called to free IOThread object + * references after this function returns success. + * * Returns: %true on success, %false on failure. **/ static bool apply_iothread_vq_mapping( @@ -1545,6 +1548,23 @@ static bool apply_iothread_vq_mapping( return true; } +/** + * cleanup_iothread_vq_mapping: + * @list: The mapping of virtqueues to IOThreads. + * + * Release IOThread object references that were acquired by + * apply_iothread_vq_mapping(). + */ +static void cleanup_iothread_vq_mapping(IOThreadVirtQueueMappingList *list) +{ + IOThreadVirtQueueMappingList *node; + + for (node = list; node; node = node->next) { + IOThread *iothread = iothread_by_id(node->value->iothread); + object_unref(OBJECT(iothread)); + } +} + /* Context: BQL held */ static bool virtio_blk_vq_aio_context_init(VirtIOBlock *s, Error **errp) { @@ -1611,12 +1631,7 @@ static void virtio_blk_vq_aio_context_cleanup(VirtIOBlock *s) assert(!s->ioeventfd_started); if (conf->iothread_vq_mapping_list) { - IOThreadVirtQueueMappingList *node; - - for (node = conf->iothread_vq_mapping_list; node; node = node->next) { - IOThread *iothread = iothread_by_id(node->value->iothread); - object_unref(OBJECT(iothread)); - } + cleanup_iothread_vq_mapping(conf->iothread_vq_mapping_list); } if (conf->iothread) { From patchwork Tue Mar 11 16:00:17 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012281 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DF06BC282EC for ; Tue, 11 Mar 2025 16:18:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2DX-000172-53; Tue, 11 Mar 2025 12:12:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts235-0004WS-1M for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:45 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22e-0005nd-KQ for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708866; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=S0iZGIQ6Nk+3iY08cHKIin7hpCa/OaaKfEQDf36r43A=; b=KezfEIgyOGFix6JCvmV+FyhNSmlY/MVKiPrde7HhteUtWbRxTwEVVWYsK6ws55c6NQsIPy 6irldAhJ7hpR6uqWvu930Rp90EIn4rrr6BUotGX7yEG4mVC2Y/TrS063rLbjt85BcYDpSD lclsGZKB9fw0rmVwpB898RvOn0qKFUI= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-463-a5fkUW1SMEa6pfCA8GvHfg-1; Tue, 11 Mar 2025 12:01:05 -0400 X-MC-Unique: a5fkUW1SMEa6pfCA8GvHfg-1 X-Mimecast-MFC-AGG-ID: a5fkUW1SMEa6pfCA8GvHfg_1741708864 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A2C901956070; Tue, 11 Mar 2025 16:01:03 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C2A8D1828A95; Tue, 11 Mar 2025 16:01:01 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 18/22] virtio-blk: tidy up iothread_vq_mapping functions Date: Tue, 11 Mar 2025 17:00:17 +0100 Message-ID: <20250311160021.349761-19-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi Use noun_verb() function naming instead of verb_noun() because the former is the most common naming style for APIs. The next commit will move these functions into a header file so that virtio-scsi can call them. Shorten iothread_vq_mapping_apply()'s iothread_vq_mapping_list argument to just "list" like in the other functions. Signed-off-by: Stefan Hajnoczi Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-10-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- hw/block/virtio-blk.c | 33 ++++++++++++++++----------------- 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 21b1b768ed..6bf7b50520 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1424,8 +1424,8 @@ static const BlockDevOps virtio_block_ops = { }; static bool -validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list, - uint16_t num_queues, Error **errp) +iothread_vq_mapping_validate(IOThreadVirtQueueMappingList *list, uint16_t + num_queues, Error **errp) { g_autofree unsigned long *vqs = bitmap_new(num_queues); g_autoptr(GHashTable) iothreads = @@ -1486,22 +1486,22 @@ validate_iothread_vq_mapping_list(IOThreadVirtQueueMappingList *list, } /** - * apply_iothread_vq_mapping: - * @iothread_vq_mapping_list: The mapping of virtqueues to IOThreads. + * iothread_vq_mapping_apply: + * @list: The mapping of virtqueues to IOThreads. * @vq_aio_context: The array of AioContext pointers to fill in. * @num_queues: The length of @vq_aio_context. * @errp: If an error occurs, a pointer to the area to store the error. * * Fill in the AioContext for each virtqueue in the @vq_aio_context array given - * the iothread-vq-mapping parameter in @iothread_vq_mapping_list. + * the iothread-vq-mapping parameter in @list. * - * cleanup_iothread_vq_mapping() must be called to free IOThread object + * iothread_vq_mapping_cleanup() must be called to free IOThread object * references after this function returns success. * * Returns: %true on success, %false on failure. **/ -static bool apply_iothread_vq_mapping( - IOThreadVirtQueueMappingList *iothread_vq_mapping_list, +static bool iothread_vq_mapping_apply( + IOThreadVirtQueueMappingList *list, AioContext **vq_aio_context, uint16_t num_queues, Error **errp) @@ -1510,16 +1510,15 @@ static bool apply_iothread_vq_mapping( size_t num_iothreads = 0; size_t cur_iothread = 0; - if (!validate_iothread_vq_mapping_list(iothread_vq_mapping_list, - num_queues, errp)) { + if (!iothread_vq_mapping_validate(list, num_queues, errp)) { return false; } - for (node = iothread_vq_mapping_list; node; node = node->next) { + for (node = list; node; node = node->next) { num_iothreads++; } - for (node = iothread_vq_mapping_list; node; node = node->next) { + for (node = list; node; node = node->next) { IOThread *iothread = iothread_by_id(node->value->iothread); AioContext *ctx = iothread_get_aio_context(iothread); @@ -1549,13 +1548,13 @@ static bool apply_iothread_vq_mapping( } /** - * cleanup_iothread_vq_mapping: + * iothread_vq_mapping_cleanup: * @list: The mapping of virtqueues to IOThreads. * * Release IOThread object references that were acquired by - * apply_iothread_vq_mapping(). + * iothread_vq_mapping_apply(). */ -static void cleanup_iothread_vq_mapping(IOThreadVirtQueueMappingList *list) +static void iothread_vq_mapping_cleanup(IOThreadVirtQueueMappingList *list) { IOThreadVirtQueueMappingList *node; @@ -1597,7 +1596,7 @@ static bool virtio_blk_vq_aio_context_init(VirtIOBlock *s, Error **errp) s->vq_aio_context = g_new(AioContext *, conf->num_queues); if (conf->iothread_vq_mapping_list) { - if (!apply_iothread_vq_mapping(conf->iothread_vq_mapping_list, + if (!iothread_vq_mapping_apply(conf->iothread_vq_mapping_list, s->vq_aio_context, conf->num_queues, errp)) { @@ -1631,7 +1630,7 @@ static void virtio_blk_vq_aio_context_cleanup(VirtIOBlock *s) assert(!s->ioeventfd_started); if (conf->iothread_vq_mapping_list) { - cleanup_iothread_vq_mapping(conf->iothread_vq_mapping_list); + iothread_vq_mapping_cleanup(conf->iothread_vq_mapping_list); } if (conf->iothread) { From patchwork Tue Mar 11 16:00:18 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012283 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 80BDDC282EC for ; Tue, 11 Mar 2025 16:20:27 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2BN-00051u-F0; Tue, 11 Mar 2025 12:10:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22p-0004X0-8r for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:35 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22i-0005oS-80 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:24 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708869; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=l0ufkswK0lBCVux74WUpfHC7vHcbevViUd8jJN1L2R4=; b=cAY22unq9bdcObH6NTU9z0v0x3d1VsEdZRj73xKOb6O9w3NokoOF2zb4sRc33npdxiy6E+ phP7zKq2AWxeMuOnls+PmG5lV1/6iFl6aSm/8KoLE3qJvgcsB/Dkli04mmpRPSdD72gcYW 4/VYIv1l/dKYcEsB1Ud8RoFqkDOZ4vA= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-116-lpAiglPuP9Gpjjlh2bOSsg-1; Tue, 11 Mar 2025 12:01:07 -0400 X-MC-Unique: lpAiglPuP9Gpjjlh2bOSsg-1 X-Mimecast-MFC-AGG-ID: lpAiglPuP9Gpjjlh2bOSsg_1741708866 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1F87D18001E0; Tue, 11 Mar 2025 16:01:06 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 388FC1800373; Tue, 11 Mar 2025 16:01:03 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 19/22] virtio: extract iothread-vq-mapping.h API Date: Tue, 11 Mar 2025 17:00:18 +0100 Message-ID: <20250311160021.349761-20-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi The code that builds an array of AioContext pointers indexed by the virtqueue is not specific to virtio-blk. virtio-scsi will need to do the same thing, so extract the functions. Signed-off-by: Stefan Hajnoczi Reviewed-by: Kevin Wolf Message-ID: <20250311132616.1049687-11-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- include/hw/virtio/iothread-vq-mapping.h | 45 ++++++++ hw/block/virtio-blk.c | 142 +----------------------- hw/virtio/iothread-vq-mapping.c | 131 ++++++++++++++++++++++ hw/virtio/meson.build | 1 + 4 files changed, 178 insertions(+), 141 deletions(-) create mode 100644 include/hw/virtio/iothread-vq-mapping.h create mode 100644 hw/virtio/iothread-vq-mapping.c diff --git a/include/hw/virtio/iothread-vq-mapping.h b/include/hw/virtio/iothread-vq-mapping.h new file mode 100644 index 0000000000..57335c3703 --- /dev/null +++ b/include/hw/virtio/iothread-vq-mapping.h @@ -0,0 +1,45 @@ +/* + * IOThread Virtqueue Mapping + * + * Copyright Red Hat, Inc + * + * SPDX-License-Identifier: GPL-2.0-only + */ + +#ifndef HW_VIRTIO_IOTHREAD_VQ_MAPPING_H +#define HW_VIRTIO_IOTHREAD_VQ_MAPPING_H + +#include "qapi/error.h" +#include "qapi/qapi-types-virtio.h" + +/** + * iothread_vq_mapping_apply: + * @list: The mapping of virtqueues to IOThreads. + * @vq_aio_context: The array of AioContext pointers to fill in. + * @num_queues: The length of @vq_aio_context. + * @errp: If an error occurs, a pointer to the area to store the error. + * + * Fill in the AioContext for each virtqueue in the @vq_aio_context array given + * the iothread-vq-mapping parameter in @list. + * + * iothread_vq_mapping_cleanup() must be called to free IOThread object + * references after this function returns success. + * + * Returns: %true on success, %false on failure. + **/ +bool iothread_vq_mapping_apply( + IOThreadVirtQueueMappingList *list, + AioContext **vq_aio_context, + uint16_t num_queues, + Error **errp); + +/** + * iothread_vq_mapping_cleanup: + * @list: The mapping of virtqueues to IOThreads. + * + * Release IOThread object references that were acquired by + * iothread_vq_mapping_apply(). + */ +void iothread_vq_mapping_cleanup(IOThreadVirtQueueMappingList *list); + +#endif /* HW_VIRTIO_IOTHREAD_VQ_MAPPING_H */ diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 6bf7b50520..5077793e5e 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -33,6 +33,7 @@ #endif #include "hw/virtio/virtio-bus.h" #include "migration/qemu-file-types.h" +#include "hw/virtio/iothread-vq-mapping.h" #include "hw/virtio/virtio-access.h" #include "hw/virtio/virtio-blk-common.h" #include "qemu/coroutine.h" @@ -1423,147 +1424,6 @@ static const BlockDevOps virtio_block_ops = { .drained_end = virtio_blk_drained_end, }; -static bool -iothread_vq_mapping_validate(IOThreadVirtQueueMappingList *list, uint16_t - num_queues, Error **errp) -{ - g_autofree unsigned long *vqs = bitmap_new(num_queues); - g_autoptr(GHashTable) iothreads = - g_hash_table_new(g_str_hash, g_str_equal); - - for (IOThreadVirtQueueMappingList *node = list; node; node = node->next) { - const char *name = node->value->iothread; - uint16List *vq; - - if (!iothread_by_id(name)) { - error_setg(errp, "IOThread \"%s\" object does not exist", name); - return false; - } - - if (!g_hash_table_add(iothreads, (gpointer)name)) { - error_setg(errp, - "duplicate IOThread name \"%s\" in iothread-vq-mapping", - name); - return false; - } - - if (node != list) { - if (!!node->value->vqs != !!list->value->vqs) { - error_setg(errp, "either all items in iothread-vq-mapping " - "must have vqs or none of them must have it"); - return false; - } - } - - for (vq = node->value->vqs; vq; vq = vq->next) { - if (vq->value >= num_queues) { - error_setg(errp, "vq index %u for IOThread \"%s\" must be " - "less than num_queues %u in iothread-vq-mapping", - vq->value, name, num_queues); - return false; - } - - if (test_and_set_bit(vq->value, vqs)) { - error_setg(errp, "cannot assign vq %u to IOThread \"%s\" " - "because it is already assigned", vq->value, name); - return false; - } - } - } - - if (list->value->vqs) { - for (uint16_t i = 0; i < num_queues; i++) { - if (!test_bit(i, vqs)) { - error_setg(errp, - "missing vq %u IOThread assignment in iothread-vq-mapping", - i); - return false; - } - } - } - - return true; -} - -/** - * iothread_vq_mapping_apply: - * @list: The mapping of virtqueues to IOThreads. - * @vq_aio_context: The array of AioContext pointers to fill in. - * @num_queues: The length of @vq_aio_context. - * @errp: If an error occurs, a pointer to the area to store the error. - * - * Fill in the AioContext for each virtqueue in the @vq_aio_context array given - * the iothread-vq-mapping parameter in @list. - * - * iothread_vq_mapping_cleanup() must be called to free IOThread object - * references after this function returns success. - * - * Returns: %true on success, %false on failure. - **/ -static bool iothread_vq_mapping_apply( - IOThreadVirtQueueMappingList *list, - AioContext **vq_aio_context, - uint16_t num_queues, - Error **errp) -{ - IOThreadVirtQueueMappingList *node; - size_t num_iothreads = 0; - size_t cur_iothread = 0; - - if (!iothread_vq_mapping_validate(list, num_queues, errp)) { - return false; - } - - for (node = list; node; node = node->next) { - num_iothreads++; - } - - for (node = list; node; node = node->next) { - IOThread *iothread = iothread_by_id(node->value->iothread); - AioContext *ctx = iothread_get_aio_context(iothread); - - /* Released in virtio_blk_vq_aio_context_cleanup() */ - object_ref(OBJECT(iothread)); - - if (node->value->vqs) { - uint16List *vq; - - /* Explicit vq:IOThread assignment */ - for (vq = node->value->vqs; vq; vq = vq->next) { - assert(vq->value < num_queues); - vq_aio_context[vq->value] = ctx; - } - } else { - /* Round-robin vq:IOThread assignment */ - for (unsigned i = cur_iothread; i < num_queues; - i += num_iothreads) { - vq_aio_context[i] = ctx; - } - } - - cur_iothread++; - } - - return true; -} - -/** - * iothread_vq_mapping_cleanup: - * @list: The mapping of virtqueues to IOThreads. - * - * Release IOThread object references that were acquired by - * iothread_vq_mapping_apply(). - */ -static void iothread_vq_mapping_cleanup(IOThreadVirtQueueMappingList *list) -{ - IOThreadVirtQueueMappingList *node; - - for (node = list; node; node = node->next) { - IOThread *iothread = iothread_by_id(node->value->iothread); - object_unref(OBJECT(iothread)); - } -} - /* Context: BQL held */ static bool virtio_blk_vq_aio_context_init(VirtIOBlock *s, Error **errp) { diff --git a/hw/virtio/iothread-vq-mapping.c b/hw/virtio/iothread-vq-mapping.c new file mode 100644 index 0000000000..15909eb933 --- /dev/null +++ b/hw/virtio/iothread-vq-mapping.c @@ -0,0 +1,131 @@ +/* + * IOThread Virtqueue Mapping + * + * Copyright Red Hat, Inc + * + * SPDX-License-Identifier: GPL-2.0-only + */ + +#include "qemu/osdep.h" +#include "system/iothread.h" +#include "hw/virtio/iothread-vq-mapping.h" + +static bool +iothread_vq_mapping_validate(IOThreadVirtQueueMappingList *list, uint16_t + num_queues, Error **errp) +{ + g_autofree unsigned long *vqs = bitmap_new(num_queues); + g_autoptr(GHashTable) iothreads = + g_hash_table_new(g_str_hash, g_str_equal); + + for (IOThreadVirtQueueMappingList *node = list; node; node = node->next) { + const char *name = node->value->iothread; + uint16List *vq; + + if (!iothread_by_id(name)) { + error_setg(errp, "IOThread \"%s\" object does not exist", name); + return false; + } + + if (!g_hash_table_add(iothreads, (gpointer)name)) { + error_setg(errp, + "duplicate IOThread name \"%s\" in iothread-vq-mapping", + name); + return false; + } + + if (node != list) { + if (!!node->value->vqs != !!list->value->vqs) { + error_setg(errp, "either all items in iothread-vq-mapping " + "must have vqs or none of them must have it"); + return false; + } + } + + for (vq = node->value->vqs; vq; vq = vq->next) { + if (vq->value >= num_queues) { + error_setg(errp, "vq index %u for IOThread \"%s\" must be " + "less than num_queues %u in iothread-vq-mapping", + vq->value, name, num_queues); + return false; + } + + if (test_and_set_bit(vq->value, vqs)) { + error_setg(errp, "cannot assign vq %u to IOThread \"%s\" " + "because it is already assigned", vq->value, name); + return false; + } + } + } + + if (list->value->vqs) { + for (uint16_t i = 0; i < num_queues; i++) { + if (!test_bit(i, vqs)) { + error_setg(errp, + "missing vq %u IOThread assignment in iothread-vq-mapping", + i); + return false; + } + } + } + + return true; +} + +bool iothread_vq_mapping_apply( + IOThreadVirtQueueMappingList *list, + AioContext **vq_aio_context, + uint16_t num_queues, + Error **errp) +{ + IOThreadVirtQueueMappingList *node; + size_t num_iothreads = 0; + size_t cur_iothread = 0; + + if (!iothread_vq_mapping_validate(list, num_queues, errp)) { + return false; + } + + for (node = list; node; node = node->next) { + num_iothreads++; + } + + for (node = list; node; node = node->next) { + IOThread *iothread = iothread_by_id(node->value->iothread); + AioContext *ctx = iothread_get_aio_context(iothread); + + /* Released in virtio_blk_vq_aio_context_cleanup() */ + object_ref(OBJECT(iothread)); + + if (node->value->vqs) { + uint16List *vq; + + /* Explicit vq:IOThread assignment */ + for (vq = node->value->vqs; vq; vq = vq->next) { + assert(vq->value < num_queues); + vq_aio_context[vq->value] = ctx; + } + } else { + /* Round-robin vq:IOThread assignment */ + for (unsigned i = cur_iothread; i < num_queues; + i += num_iothreads) { + vq_aio_context[i] = ctx; + } + } + + cur_iothread++; + } + + return true; +} + +void iothread_vq_mapping_cleanup(IOThreadVirtQueueMappingList *list) +{ + IOThreadVirtQueueMappingList *node; + + for (node = list; node; node = node->next) { + IOThread *iothread = iothread_by_id(node->value->iothread); + object_unref(OBJECT(iothread)); + } +} + diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build index a5f9f7999d..19b04c4d9c 100644 --- a/hw/virtio/meson.build +++ b/hw/virtio/meson.build @@ -1,5 +1,6 @@ system_virtio_ss = ss.source_set() system_virtio_ss.add(files('virtio-bus.c')) +system_virtio_ss.add(files('iothread-vq-mapping.c')) system_virtio_ss.add(when: 'CONFIG_VIRTIO_PCI', if_true: files('virtio-pci.c')) system_virtio_ss.add(when: 'CONFIG_VIRTIO_MMIO', if_true: files('virtio-mmio.c')) system_virtio_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('virtio-crypto.c')) From patchwork Tue Mar 11 16:00:19 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012279 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2D90CC2BA1B for ; Tue, 11 Mar 2025 16:16:24 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2Bv-0006if-Ui; Tue, 11 Mar 2025 12:10:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22p-0004X2-93 for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:35 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22h-0005oc-SO for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:24 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708871; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AQ0xCHnasOhoHWvSQCCdtyFzo2H0y014rFFzhhMEOKc=; b=dHNh6RWdbz9JcS9hvuoI0+TYvJV0uL2wx3HDBQU9E8KzZkxzCHILhpViUXr+BasxfjUfoI bosE+q6wI/XfUK26nxYAgsNyGKB/bRIxV91nb4qeVucHFCxYe65lrKD+ZAHelek6hX/Lak All4rweuGmW5FsLs+l1Jb94KPLPjOUE= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-206-De5gOT-HMGS202r4gvqS2A-1; Tue, 11 Mar 2025 12:01:09 -0400 X-MC-Unique: De5gOT-HMGS202r4gvqS2A-1 X-Mimecast-MFC-AGG-ID: De5gOT-HMGS202r4gvqS2A_1741708868 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7057A195608F; Tue, 11 Mar 2025 16:01:08 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id A100718001E9; Tue, 11 Mar 2025 16:01:06 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 20/22] virtio-scsi: add iothread-vq-mapping parameter Date: Tue, 11 Mar 2025 17:00:19 +0100 Message-ID: <20250311160021.349761-21-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi Allow virtio-scsi virtqueues to be assigned to different IOThreads. This makes it possible to take advantage of host multi-queue block layer scalability by assigning virtqueues that have affinity with vCPUs to different IOThreads that have affinity with host CPUs. The same feature was introduced for virtio-blk in the past: https://developers.redhat.com/articles/2024/09/05/scaling-virtio-blk-disk-io-iothread-virtqueue-mapping Here are fio randread 4k iodepth=64 results from a 4 vCPU guest with an Intel P4800X SSD: iothreads IOPS ------------------------------ 1 189576 2 312698 4 346744 Signed-off-by: Stefan Hajnoczi Message-ID: <20250311132616.1049687-12-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- include/hw/virtio/virtio-scsi.h | 5 +- hw/scsi/virtio-scsi-dataplane.c | 90 ++++++++++++++++++++++++--------- hw/scsi/virtio-scsi.c | 63 ++++++++++++++--------- 3 files changed, 107 insertions(+), 51 deletions(-) diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h index 7b7e3ced7a..086201efa2 100644 --- a/include/hw/virtio/virtio-scsi.h +++ b/include/hw/virtio/virtio-scsi.h @@ -22,6 +22,7 @@ #include "hw/virtio/virtio.h" #include "hw/scsi/scsi.h" #include "chardev/char-fe.h" +#include "qapi/qapi-types-virtio.h" #include "system/iothread.h" #define TYPE_VIRTIO_SCSI_COMMON "virtio-scsi-common" @@ -60,6 +61,7 @@ struct VirtIOSCSIConf { CharBackend chardev; uint32_t boot_tpgt; IOThread *iothread; + IOThreadVirtQueueMappingList *iothread_vq_mapping_list; }; struct VirtIOSCSI; @@ -97,7 +99,7 @@ struct VirtIOSCSI { QTAILQ_HEAD(, VirtIOSCSIReq) tmf_bh_list; /* Fields for dataplane below */ - AioContext *ctx; /* one iothread per virtio-scsi-pci for now */ + AioContext **vq_aio_context; /* per-virtqueue AioContext pointer */ bool dataplane_started; bool dataplane_starting; @@ -115,6 +117,7 @@ void virtio_scsi_common_realize(DeviceState *dev, void virtio_scsi_common_unrealize(DeviceState *dev); void virtio_scsi_dataplane_setup(VirtIOSCSI *s, Error **errp); +void virtio_scsi_dataplane_cleanup(VirtIOSCSI *s); int virtio_scsi_dataplane_start(VirtIODevice *s); void virtio_scsi_dataplane_stop(VirtIODevice *s); diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c index f49ab98ecc..6bb368c8a5 100644 --- a/hw/scsi/virtio-scsi-dataplane.c +++ b/hw/scsi/virtio-scsi-dataplane.c @@ -18,6 +18,7 @@ #include "system/block-backend.h" #include "hw/scsi/scsi.h" #include "scsi/constants.h" +#include "hw/virtio/iothread-vq-mapping.h" #include "hw/virtio/virtio-bus.h" /* Context: BQL held */ @@ -27,8 +28,16 @@ void virtio_scsi_dataplane_setup(VirtIOSCSI *s, Error **errp) VirtIODevice *vdev = VIRTIO_DEVICE(s); BusState *qbus = qdev_get_parent_bus(DEVICE(vdev)); VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); + uint16_t num_vqs = vs->conf.num_queues + VIRTIO_SCSI_VQ_NUM_FIXED; - if (vs->conf.iothread) { + if (vs->conf.iothread && vs->conf.iothread_vq_mapping_list) { + error_setg(errp, + "iothread and iothread-vq-mapping properties cannot be set " + "at the same time"); + return; + } + + if (vs->conf.iothread || vs->conf.iothread_vq_mapping_list) { if (!k->set_guest_notifiers || !k->ioeventfd_assign) { error_setg(errp, "device is incompatible with iothread " @@ -39,13 +48,48 @@ void virtio_scsi_dataplane_setup(VirtIOSCSI *s, Error **errp) error_setg(errp, "ioeventfd is required for iothread"); return; } - s->ctx = iothread_get_aio_context(vs->conf.iothread); - } else { - if (!virtio_device_ioeventfd_enabled(vdev)) { + } + + s->vq_aio_context = g_new(AioContext *, num_vqs); + + if (vs->conf.iothread_vq_mapping_list) { + if (!iothread_vq_mapping_apply(vs->conf.iothread_vq_mapping_list, + s->vq_aio_context, num_vqs, errp)) { + g_free(s->vq_aio_context); + s->vq_aio_context = NULL; return; } - s->ctx = qemu_get_aio_context(); + } else if (vs->conf.iothread) { + AioContext *ctx = iothread_get_aio_context(vs->conf.iothread); + for (uint16_t i = 0; i < num_vqs; i++) { + s->vq_aio_context[i] = ctx; + } + + /* Released in virtio_scsi_dataplane_cleanup() */ + object_ref(OBJECT(vs->conf.iothread)); + } else { + AioContext *ctx = qemu_get_aio_context(); + for (unsigned i = 0; i < num_vqs; i++) { + s->vq_aio_context[i] = ctx; + } + } +} + +/* Context: BQL held */ +void virtio_scsi_dataplane_cleanup(VirtIOSCSI *s) +{ + VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s); + + if (vs->conf.iothread_vq_mapping_list) { + iothread_vq_mapping_cleanup(vs->conf.iothread_vq_mapping_list); } + + if (vs->conf.iothread) { + object_unref(OBJECT(vs->conf.iothread)); + } + + g_free(s->vq_aio_context); + s->vq_aio_context = NULL; } static int virtio_scsi_set_host_notifier(VirtIOSCSI *s, VirtQueue *vq, int n) @@ -66,31 +110,20 @@ static int virtio_scsi_set_host_notifier(VirtIOSCSI *s, VirtQueue *vq, int n) } /* Context: BH in IOThread */ -static void virtio_scsi_dataplane_stop_bh(void *opaque) +static void virtio_scsi_dataplane_stop_vq_bh(void *opaque) { - VirtIOSCSI *s = opaque; - VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s); + AioContext *ctx = qemu_get_current_aio_context(); + VirtQueue *vq = opaque; EventNotifier *host_notifier; - int i; - virtio_queue_aio_detach_host_notifier(vs->ctrl_vq, s->ctx); - host_notifier = virtio_queue_get_host_notifier(vs->ctrl_vq); + virtio_queue_aio_detach_host_notifier(vq, ctx); + host_notifier = virtio_queue_get_host_notifier(vq); /* * Test and clear notifier after disabling event, in case poll callback * didn't have time to run. */ virtio_queue_host_notifier_read(host_notifier); - - virtio_queue_aio_detach_host_notifier(vs->event_vq, s->ctx); - host_notifier = virtio_queue_get_host_notifier(vs->event_vq); - virtio_queue_host_notifier_read(host_notifier); - - for (i = 0; i < vs->conf.num_queues; i++) { - virtio_queue_aio_detach_host_notifier(vs->cmd_vqs[i], s->ctx); - host_notifier = virtio_queue_get_host_notifier(vs->cmd_vqs[i]); - virtio_queue_host_notifier_read(host_notifier); - } } /* Context: BQL held */ @@ -154,11 +187,14 @@ int virtio_scsi_dataplane_start(VirtIODevice *vdev) smp_wmb(); /* paired with aio_notify_accept() */ if (s->bus.drain_count == 0) { - virtio_queue_aio_attach_host_notifier(vs->ctrl_vq, s->ctx); - virtio_queue_aio_attach_host_notifier_no_poll(vs->event_vq, s->ctx); + virtio_queue_aio_attach_host_notifier(vs->ctrl_vq, + s->vq_aio_context[0]); + virtio_queue_aio_attach_host_notifier_no_poll(vs->event_vq, + s->vq_aio_context[1]); for (i = 0; i < vs->conf.num_queues; i++) { - virtio_queue_aio_attach_host_notifier(vs->cmd_vqs[i], s->ctx); + AioContext *ctx = s->vq_aio_context[VIRTIO_SCSI_VQ_NUM_FIXED + i]; + virtio_queue_aio_attach_host_notifier(vs->cmd_vqs[i], ctx); } } return 0; @@ -207,7 +243,11 @@ void virtio_scsi_dataplane_stop(VirtIODevice *vdev) s->dataplane_stopping = true; if (s->bus.drain_count == 0) { - aio_wait_bh_oneshot(s->ctx, virtio_scsi_dataplane_stop_bh, s); + for (i = 0; i < vs->conf.num_queues + VIRTIO_SCSI_VQ_NUM_FIXED; i++) { + VirtQueue *vq = virtio_get_queue(&vs->parent_obj, i); + AioContext *ctx = s->vq_aio_context[i]; + aio_wait_bh_oneshot(ctx, virtio_scsi_dataplane_stop_vq_bh, vq); + } } blk_drain_all(); /* ensure there are no in-flight requests */ diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 2045d27289..9f61eb97db 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -27,6 +27,7 @@ #include "hw/qdev-properties.h" #include "hw/scsi/scsi.h" #include "scsi/constants.h" +#include "hw/virtio/iothread-vq-mapping.h" #include "hw/virtio/virtio-bus.h" #include "hw/virtio/virtio-access.h" #include "trace.h" @@ -318,13 +319,6 @@ static void virtio_scsi_cancel_notify(Notifier *notifier, void *data) g_free(n); } -static inline void virtio_scsi_ctx_check(VirtIOSCSI *s, SCSIDevice *d) -{ - if (s->dataplane_started && d && blk_is_available(d->conf.blk)) { - assert(blk_get_aio_context(d->conf.blk) == s->ctx); - } -} - static void virtio_scsi_do_one_tmf_bh(VirtIOSCSIReq *req) { VirtIOSCSI *s = req->dev; @@ -517,9 +511,11 @@ static void virtio_scsi_flush_defer_tmf_to_aio_context(VirtIOSCSI *s) assert(!s->dataplane_started); - if (s->ctx) { + for (uint32_t i = 0; i < s->parent_obj.conf.num_queues; i++) { + AioContext *ctx = s->vq_aio_context[VIRTIO_SCSI_VQ_NUM_FIXED + i]; + /* Our BH only runs after previously scheduled BHs */ - aio_wait_bh_oneshot(s->ctx, dummy_bh, NULL); + aio_wait_bh_oneshot(ctx, dummy_bh, NULL); } } @@ -575,7 +571,6 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req) AioContext *ctx; int ret = 0; - virtio_scsi_ctx_check(s, d); /* Here VIRTIO_SCSI_S_OK means "FUNCTION COMPLETE". */ req->resp.tmf.response = VIRTIO_SCSI_S_OK; @@ -639,6 +634,8 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req) case VIRTIO_SCSI_T_TMF_ABORT_TASK_SET: case VIRTIO_SCSI_T_TMF_CLEAR_TASK_SET: { + g_autoptr(GHashTable) aio_contexts = g_hash_table_new(NULL, NULL); + if (!d) { goto fail; } @@ -648,8 +645,15 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req) qatomic_inc(&req->remaining); - ctx = s->ctx ?: qemu_get_aio_context(); - virtio_scsi_defer_tmf_to_aio_context(req, ctx); + for (uint32_t i = 0; i < s->parent_obj.conf.num_queues; i++) { + ctx = s->vq_aio_context[VIRTIO_SCSI_VQ_NUM_FIXED + i]; + + if (!g_hash_table_add(aio_contexts, ctx)) { + continue; /* skip previously added AioContext */ + } + + virtio_scsi_defer_tmf_to_aio_context(req, ctx); + } virtio_scsi_tmf_dec_remaining(req); ret = -EINPROGRESS; @@ -770,9 +774,12 @@ static void virtio_scsi_handle_ctrl_vq(VirtIOSCSI *s, VirtQueue *vq) */ static bool virtio_scsi_defer_to_dataplane(VirtIOSCSI *s) { - if (!s->ctx || s->dataplane_started) { + if (s->dataplane_started) { return false; } + if (s->vq_aio_context[0] == qemu_get_aio_context()) { + return false; /* not using IOThreads */ + } virtio_device_start_ioeventfd(&s->parent_obj.parent_obj); return !s->dataplane_fenced; @@ -946,7 +953,6 @@ static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req) virtio_scsi_complete_cmd_req(req); return -ENOENT; } - virtio_scsi_ctx_check(s, d); req->sreq = scsi_req_new(d, req->req.cmd.tag, virtio_scsi_get_lun(req->req.cmd.lun), req->req.cmd.cdb, vs->cdb_size, req); @@ -1218,14 +1224,16 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev, { VirtIODevice *vdev = VIRTIO_DEVICE(hotplug_dev); VirtIOSCSI *s = VIRTIO_SCSI(vdev); + AioContext *ctx = s->vq_aio_context[VIRTIO_SCSI_VQ_NUM_FIXED]; SCSIDevice *sd = SCSI_DEVICE(dev); - int ret; - if (s->ctx && !s->dataplane_fenced) { - ret = blk_set_aio_context(sd->conf.blk, s->ctx, errp); - if (ret < 0) { - return; - } + if (ctx != qemu_get_aio_context() && !s->dataplane_fenced) { + /* + * Try to make the BlockBackend's AioContext match ours. Ignore failure + * because I/O will still work although block jobs and other users + * might be slower when multiple AioContexts use a BlockBackend. + */ + blk_set_aio_context(sd->conf.blk, ctx, errp); } if (virtio_vdev_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG)) { @@ -1260,7 +1268,7 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev, qdev_simple_device_unplug_cb(hotplug_dev, dev, errp); - if (s->ctx) { + if (s->vq_aio_context[VIRTIO_SCSI_VQ_NUM_FIXED] != qemu_get_aio_context()) { /* If other users keep the BlockBackend in the iothread, that's ok */ blk_set_aio_context(sd->conf.blk, qemu_get_aio_context(), NULL); } @@ -1294,7 +1302,7 @@ static void virtio_scsi_drained_begin(SCSIBus *bus) for (uint32_t i = 0; i < total_queues; i++) { VirtQueue *vq = virtio_get_queue(vdev, i); - virtio_queue_aio_detach_host_notifier(vq, s->ctx); + virtio_queue_aio_detach_host_notifier(vq, s->vq_aio_context[i]); } } @@ -1320,10 +1328,12 @@ static void virtio_scsi_drained_end(SCSIBus *bus) for (uint32_t i = 0; i < total_queues; i++) { VirtQueue *vq = virtio_get_queue(vdev, i); + AioContext *ctx = s->vq_aio_context[i]; + if (vq == vs->event_vq) { - virtio_queue_aio_attach_host_notifier_no_poll(vq, s->ctx); + virtio_queue_aio_attach_host_notifier_no_poll(vq, ctx); } else { - virtio_queue_aio_attach_host_notifier(vq, s->ctx); + virtio_queue_aio_attach_host_notifier(vq, ctx); } } } @@ -1430,12 +1440,13 @@ void virtio_scsi_common_unrealize(DeviceState *dev) virtio_cleanup(vdev); } +/* main loop */ static void virtio_scsi_device_unrealize(DeviceState *dev) { VirtIOSCSI *s = VIRTIO_SCSI(dev); virtio_scsi_reset_tmf_bh(s); - + virtio_scsi_dataplane_cleanup(s); qbus_set_hotplug_handler(BUS(&s->bus), NULL); virtio_scsi_common_unrealize(dev); qemu_mutex_destroy(&s->tmf_bh_lock); @@ -1460,6 +1471,8 @@ static const Property virtio_scsi_properties[] = { VIRTIO_SCSI_F_CHANGE, true), DEFINE_PROP_LINK("iothread", VirtIOSCSI, parent_obj.conf.iothread, TYPE_IOTHREAD, IOThread *), + DEFINE_PROP_IOTHREAD_VQ_MAPPING_LIST("iothread-vq-mapping", VirtIOSCSI, + parent_obj.conf.iothread_vq_mapping_list), }; static const VMStateDescription vmstate_virtio_scsi = { From patchwork Tue Mar 11 16:00:20 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012268 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AD577C282EC for ; Tue, 11 Mar 2025 16:14:45 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2Bc-00061E-Nr; Tue, 11 Mar 2025 12:10:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22p-0004Wy-8j for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:35 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22h-0005pO-LK for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708875; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OeBbLzT2TofkNhRYH5+kh6JTgvIl0D7nG0iqsPQZ9bo=; b=Ecs01LawtLF0oU3XhGziwnKDbhp9wD/kKkLhdpYbpOi3FOZeSCKSoY86IaCPUJX44pCn84 PxtagTB3K21bnf6TowgihS+Y2TWME+XioTcdq3pRfXeyTV0QB0Q68O+GWo3EC9wEOGCh/o TIk61ualqIhzH+3tfkumKMWFHe9qiT8= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-657-rYI0a860NvWRnkicR9DsGw-1; Tue, 11 Mar 2025 12:01:11 -0400 X-MC-Unique: rYI0a860NvWRnkicR9DsGw-1 X-Mimecast-MFC-AGG-ID: rYI0a860NvWRnkicR9DsGw_1741708870 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A0C0B180035E; Tue, 11 Mar 2025 16:01:10 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 07DDF1800373; Tue, 11 Mar 2025 16:01:08 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 21/22] virtio-scsi: handle ctrl virtqueue in main loop Date: Tue, 11 Mar 2025 17:00:20 +0100 Message-ID: <20250311160021.349761-22-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi Previously the ctrl virtqueue was handled in the AioContext where SCSI requests are processed. When IOThread Virtqueue Mapping was added things become more complicated because SCSI requests could run in other AioContexts. Simplify by handling the ctrl virtqueue in the main loop where reset operations can be performed. Note that BHs are still used canceling SCSI requests in their AioContexts but at least the mean loop activity doesn't need BHs anymore. Signed-off-by: Stefan Hajnoczi Message-ID: <20250311132616.1049687-13-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- include/hw/virtio/virtio-scsi.h | 8 -- hw/scsi/virtio-scsi-dataplane.c | 6 ++ hw/scsi/virtio-scsi.c | 144 ++++++-------------------------- 3 files changed, 33 insertions(+), 125 deletions(-) diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h index 086201efa2..31e852ed6c 100644 --- a/include/hw/virtio/virtio-scsi.h +++ b/include/hw/virtio/virtio-scsi.h @@ -90,14 +90,6 @@ struct VirtIOSCSI { QemuMutex ctrl_lock; /* protects ctrl_vq */ - /* - * TMFs deferred to main loop BH. These fields are protected by - * tmf_bh_lock. - */ - QemuMutex tmf_bh_lock; - QEMUBH *tmf_bh; - QTAILQ_HEAD(, VirtIOSCSIReq) tmf_bh_list; - /* Fields for dataplane below */ AioContext **vq_aio_context; /* per-virtqueue AioContext pointer */ diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c index 6bb368c8a5..2d37fa6712 100644 --- a/hw/scsi/virtio-scsi-dataplane.c +++ b/hw/scsi/virtio-scsi-dataplane.c @@ -73,6 +73,12 @@ void virtio_scsi_dataplane_setup(VirtIOSCSI *s, Error **errp) s->vq_aio_context[i] = ctx; } } + + /* + * Always handle the ctrl virtqueue in the main loop thread where device + * resets can be performed. + */ + s->vq_aio_context[0] = qemu_get_aio_context(); } /* Context: BQL held */ diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 9f61eb97db..d9c32c3245 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -319,115 +319,6 @@ static void virtio_scsi_cancel_notify(Notifier *notifier, void *data) g_free(n); } -static void virtio_scsi_do_one_tmf_bh(VirtIOSCSIReq *req) -{ - VirtIOSCSI *s = req->dev; - SCSIDevice *d = virtio_scsi_device_get(s, req->req.tmf.lun); - BusChild *kid; - int target; - - switch (req->req.tmf.subtype) { - case VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET: - if (!d) { - req->resp.tmf.response = VIRTIO_SCSI_S_BAD_TARGET; - goto out; - } - if (d->lun != virtio_scsi_get_lun(req->req.tmf.lun)) { - req->resp.tmf.response = VIRTIO_SCSI_S_INCORRECT_LUN; - goto out; - } - qatomic_inc(&s->resetting); - device_cold_reset(&d->qdev); - qatomic_dec(&s->resetting); - break; - - case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET: - target = req->req.tmf.lun[1]; - qatomic_inc(&s->resetting); - - rcu_read_lock(); - QTAILQ_FOREACH_RCU(kid, &s->bus.qbus.children, sibling) { - SCSIDevice *d1 = SCSI_DEVICE(kid->child); - if (d1->channel == 0 && d1->id == target) { - device_cold_reset(&d1->qdev); - } - } - rcu_read_unlock(); - - qatomic_dec(&s->resetting); - break; - - default: - g_assert_not_reached(); - } - -out: - object_unref(OBJECT(d)); - virtio_scsi_complete_req(req, &s->ctrl_lock); -} - -/* Some TMFs must be processed from the main loop thread */ -static void virtio_scsi_do_tmf_bh(void *opaque) -{ - VirtIOSCSI *s = opaque; - QTAILQ_HEAD(, VirtIOSCSIReq) reqs = QTAILQ_HEAD_INITIALIZER(reqs); - VirtIOSCSIReq *req; - VirtIOSCSIReq *tmp; - - GLOBAL_STATE_CODE(); - - WITH_QEMU_LOCK_GUARD(&s->tmf_bh_lock) { - QTAILQ_FOREACH_SAFE(req, &s->tmf_bh_list, next, tmp) { - QTAILQ_REMOVE(&s->tmf_bh_list, req, next); - QTAILQ_INSERT_TAIL(&reqs, req, next); - } - - qemu_bh_delete(s->tmf_bh); - s->tmf_bh = NULL; - } - - QTAILQ_FOREACH_SAFE(req, &reqs, next, tmp) { - QTAILQ_REMOVE(&reqs, req, next); - virtio_scsi_do_one_tmf_bh(req); - } -} - -static void virtio_scsi_reset_tmf_bh(VirtIOSCSI *s) -{ - VirtIOSCSIReq *req; - VirtIOSCSIReq *tmp; - - GLOBAL_STATE_CODE(); - - /* Called after ioeventfd has been stopped, so tmf_bh_lock is not needed */ - if (s->tmf_bh) { - qemu_bh_delete(s->tmf_bh); - s->tmf_bh = NULL; - } - - QTAILQ_FOREACH_SAFE(req, &s->tmf_bh_list, next, tmp) { - QTAILQ_REMOVE(&s->tmf_bh_list, req, next); - - /* SAM-6 6.3.2 Hard reset */ - req->resp.tmf.response = VIRTIO_SCSI_S_TARGET_FAILURE; - virtio_scsi_complete_req(req, &req->dev->ctrl_lock); - } -} - -static void virtio_scsi_defer_tmf_to_main_loop(VirtIOSCSIReq *req) -{ - VirtIOSCSI *s = req->dev; - - WITH_QEMU_LOCK_GUARD(&s->tmf_bh_lock) { - QTAILQ_INSERT_TAIL(&s->tmf_bh_list, req, next); - - if (!s->tmf_bh) { - s->tmf_bh = qemu_bh_new(virtio_scsi_do_tmf_bh, s); - qemu_bh_schedule(s->tmf_bh); - } - } -} - static void virtio_scsi_tmf_cancel_req(VirtIOSCSIReq *tmf, SCSIRequest *r) { VirtIOSCSICancelNotifier *notifier; @@ -627,11 +518,35 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req) break; case VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET: - case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET: - virtio_scsi_defer_tmf_to_main_loop(req); - ret = -EINPROGRESS; + if (!d) { + goto fail; + } + if (d->lun != virtio_scsi_get_lun(req->req.tmf.lun)) { + goto incorrect_lun; + } + qatomic_inc(&s->resetting); + device_cold_reset(&d->qdev); + qatomic_dec(&s->resetting); break; + case VIRTIO_SCSI_T_TMF_I_T_NEXUS_RESET: { + BusChild *kid; + int target = req->req.tmf.lun[1]; + qatomic_inc(&s->resetting); + + rcu_read_lock(); + QTAILQ_FOREACH_RCU(kid, &s->bus.qbus.children, sibling) { + SCSIDevice *d1 = SCSI_DEVICE(kid->child); + if (d1->channel == 0 && d1->id == target) { + device_cold_reset(&d1->qdev); + } + } + rcu_read_unlock(); + + qatomic_dec(&s->resetting); + break; + } + case VIRTIO_SCSI_T_TMF_ABORT_TASK_SET: case VIRTIO_SCSI_T_TMF_CLEAR_TASK_SET: { g_autoptr(GHashTable) aio_contexts = g_hash_table_new(NULL, NULL); @@ -1087,7 +1002,6 @@ static void virtio_scsi_reset(VirtIODevice *vdev) assert(!s->dataplane_started); - virtio_scsi_reset_tmf_bh(s); virtio_scsi_flush_defer_tmf_to_aio_context(s); qatomic_inc(&s->resetting); @@ -1402,10 +1316,8 @@ static void virtio_scsi_device_realize(DeviceState *dev, Error **errp) VirtIOSCSI *s = VIRTIO_SCSI(dev); Error *err = NULL; - QTAILQ_INIT(&s->tmf_bh_list); qemu_mutex_init(&s->ctrl_lock); qemu_mutex_init(&s->event_lock); - qemu_mutex_init(&s->tmf_bh_lock); virtio_scsi_common_realize(dev, virtio_scsi_handle_ctrl, @@ -1445,11 +1357,9 @@ static void virtio_scsi_device_unrealize(DeviceState *dev) { VirtIOSCSI *s = VIRTIO_SCSI(dev); - virtio_scsi_reset_tmf_bh(s); virtio_scsi_dataplane_cleanup(s); qbus_set_hotplug_handler(BUS(&s->bus), NULL); virtio_scsi_common_unrealize(dev); - qemu_mutex_destroy(&s->tmf_bh_lock); qemu_mutex_destroy(&s->event_lock); qemu_mutex_destroy(&s->ctrl_lock); } From patchwork Tue Mar 11 16:00:21 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 14012284 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7B626C282EC for ; Tue, 11 Mar 2025 16:20:58 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ts2DF-0000ch-Fw; Tue, 11 Mar 2025 12:12:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts235-0004Ww-3Y for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:45 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ts22i-0005ph-8B for qemu-devel@nongnu.org; Tue, 11 Mar 2025 12:01:24 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741708876; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=erCvDywXkGyZsgBRt9YjSOfw93ZGtqYpmT63Q89bCRc=; b=iJEN3Q3CXp2+zvAdZ3byR+cEb4uT5Sria0xkWr5vMuGbbhcB3Tslz7b3w6Urw9+kOin9YQ YamhWUrHRV06mzXQokVkajTzaI5UR5nlKCVzSBPL8MarxAai6a7niq53aIlhwon2jqlc2+ zENeVFDLqTYQbJuse2/KDtxcnSZ9dSI= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-687-_VdgvvdANJmYWTcZjUYE7A-1; Tue, 11 Mar 2025 12:01:13 -0400 X-MC-Unique: _VdgvvdANJmYWTcZjUYE7A-1 X-Mimecast-MFC-AGG-ID: _VdgvvdANJmYWTcZjUYE7A_1741708872 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B7655180899B; Tue, 11 Mar 2025 16:01:12 +0000 (UTC) Received: from merkur.redhat.com (unknown [10.44.33.18]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 2B8F618001E9; Tue, 11 Mar 2025 16:01:10 +0000 (UTC) From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org Subject: [PULL 22/22] virtio-scsi: only expose cmd vqs via iothread-vq-mapping Date: Tue, 11 Mar 2025 17:00:21 +0100 Message-ID: <20250311160021.349761-23-kwolf@redhat.com> In-Reply-To: <20250311160021.349761-1-kwolf@redhat.com> References: <20250311160021.349761-1-kwolf@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass client-ip=170.10.129.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Stefan Hajnoczi Peter Krempa and Kevin Wolf observed that iothread-vq-mapping is confusing to use because the control and event virtqueues have a fixed location before the command virtqueues but need to be treated differently. Only expose the command virtqueues via iothread-vq-mapping so that the command-line parameter is intuitive: it controls where SCSI requests are processed. The control virtqueue needs to be hardcoded to the main loop thread for technical reasons anyway. Kevin also pointed out that it's better to place the event virtqueue in the main loop thread since its no poll behavior would prevent polling if assigned to an IOThread. This change is its own commit to avoid squashing the previous commit. Suggested-by: Kevin Wolf Suggested-by: Peter Krempa Signed-off-by: Stefan Hajnoczi Message-ID: <20250311132616.1049687-14-stefanha@redhat.com> Signed-off-by: Kevin Wolf --- hw/scsi/virtio-scsi-dataplane.c | 33 ++++++++++++++++++++------------- 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c index 2d37fa6712..95f13fb7c2 100644 --- a/hw/scsi/virtio-scsi-dataplane.c +++ b/hw/scsi/virtio-scsi-dataplane.c @@ -28,7 +28,6 @@ void virtio_scsi_dataplane_setup(VirtIOSCSI *s, Error **errp) VirtIODevice *vdev = VIRTIO_DEVICE(s); BusState *qbus = qdev_get_parent_bus(DEVICE(vdev)); VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); - uint16_t num_vqs = vs->conf.num_queues + VIRTIO_SCSI_VQ_NUM_FIXED; if (vs->conf.iothread && vs->conf.iothread_vq_mapping_list) { error_setg(errp, @@ -50,35 +49,43 @@ void virtio_scsi_dataplane_setup(VirtIOSCSI *s, Error **errp) } } - s->vq_aio_context = g_new(AioContext *, num_vqs); + s->vq_aio_context = g_new(AioContext *, vs->conf.num_queues + + VIRTIO_SCSI_VQ_NUM_FIXED); + + /* + * Handle the ctrl virtqueue in the main loop thread where device resets + * can be performed. + */ + s->vq_aio_context[0] = qemu_get_aio_context(); + + /* + * Handle the event virtqueue in the main loop thread where its no_poll + * behavior won't stop IOThread polling. + */ + s->vq_aio_context[1] = qemu_get_aio_context(); if (vs->conf.iothread_vq_mapping_list) { if (!iothread_vq_mapping_apply(vs->conf.iothread_vq_mapping_list, - s->vq_aio_context, num_vqs, errp)) { + &s->vq_aio_context[VIRTIO_SCSI_VQ_NUM_FIXED], + vs->conf.num_queues, errp)) { g_free(s->vq_aio_context); s->vq_aio_context = NULL; return; } } else if (vs->conf.iothread) { AioContext *ctx = iothread_get_aio_context(vs->conf.iothread); - for (uint16_t i = 0; i < num_vqs; i++) { - s->vq_aio_context[i] = ctx; + for (uint16_t i = 0; i < vs->conf.num_queues; i++) { + s->vq_aio_context[VIRTIO_SCSI_VQ_NUM_FIXED + i] = ctx; } /* Released in virtio_scsi_dataplane_cleanup() */ object_ref(OBJECT(vs->conf.iothread)); } else { AioContext *ctx = qemu_get_aio_context(); - for (unsigned i = 0; i < num_vqs; i++) { - s->vq_aio_context[i] = ctx; + for (unsigned i = 0; i < vs->conf.num_queues; i++) { + s->vq_aio_context[VIRTIO_SCSI_VQ_NUM_FIXED + i] = ctx; } } - - /* - * Always handle the ctrl virtqueue in the main loop thread where device - * resets can be performed. - */ - s->vq_aio_context[0] = qemu_get_aio_context(); } /* Context: BQL held */