From patchwork Tue Feb 8 15:25:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cristian Marussi X-Patchwork-Id: 12738876 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4A408C433F5 for ; Tue, 8 Feb 2022 15:26:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:List-Subscribe:List-Help: List-Post:List-Archive:List-Unsubscribe:List-Id:Message-Id:Date:Subject:Cc:To :From:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To:References: List-Owner; bh=kdXTlYIwCqLcsqZApqUSW9Zpde7iGN8xD2Iu+6LeCO8=; b=RF2bWp177xOEyH MoZ7qlX61inB2pdFhSWMtQ9r2aBoq6vw6HM6VBmyG36GZ9yf4Et8vwofPwqotTDpW6tl1N4xkhDp+ wzSS0rFahRq1n/YXOvgwTikoAAueSTPc3Dc+ICqkZB5tphIRAfW1tTPbUAwLT5cfyk+wAO3dA+lQT CO3eL33uX4z3pib0i4ynCL91jU+U2sJbH9Tx4p1TDtoPbeoPmaXX2q2x1HVwpfoJTgt/00N2jfDOe NkY/4BElJYhUb7GQfb9FwigtW+fDnA/pQBeD1iPLo2kjqJYJdfKrxdjUQmx/Gc8OA9PAxR2PeT3Zp SvspivsRxKyr/PJ91oUA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1nHSN6-00ESwT-Qc; Tue, 08 Feb 2022 15:25:36 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1nHSN2-00ESvI-Co for linux-arm-kernel@lists.infradead.org; Tue, 08 Feb 2022 15:25:34 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AD0CB2B; Tue, 8 Feb 2022 07:25:30 -0800 (PST) Received: from e120937-lin.home (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4A5223F73B; Tue, 8 Feb 2022 07:25:29 -0800 (PST) From: Cristian Marussi To: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, virtualization@lists.linux-foundation.org Cc: sudeep.holla@arm.com, mst@redhat.com, vincent.guittot@linaro.org, souvik.chakravarty@arm.com, peter.hilber@opensynergy.com, igor.skalkin@opensynergy.com, cristian.marussi@arm.com Subject: [PATCH 0/4] Introduce optional VirtIO ring wrap counters Date: Tue, 8 Feb 2022 15:25:16 +0000 Message-Id: <20220208152520.52983-1-cristian.marussi@arm.com> X-Mailer: git-send-email 2.17.1 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220208_072532_547811_C7243BAA X-CRM114-Status: GOOD ( 18.96 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Hi all, While using the current VirtIO API to optionally access the virtqueues in polling mode, so as to able to send messages (SCMI commands and replies in my case) in atomic context, it was pointed out in reviews that the current VirtIO API based on virtqueue_poll() is prone to an ABA problem: in fact it is possible, even though extremely unlikely, that, between two polling iterations, the polled index (internal last_used_idx), moving away by exactly 2^16 positions, rolls over into the same apparent exact index position as in the previous polling iteration, so fooling the user into thinking that nothing new has arrived. (while a whopping 2^16 msgs went by) Patch 3/4 intend to mitigate this scenario by embedding a 16 bits wrap counter in the upper 16 bits of the opaque value that is returned by virtqueue_enable_cb_prepare and is, as usual, fed back to virtqueue_poll. This way the ABA problem is indeed mitigated since it can now happen only if a 2^32 'messages-wraparound' happen, which seems better enough. Given that, as of now, the polling behaviour is really not required by any driver but the SCMI VirtIO transport layer, the wrap counting in the core virtio functions is conditional to a use_wrap_counter flag which is false by default and must be explicitly enabled by a driver before virtqueues are made ready (i.e. in the driver probe after the vqs have been discovered). In order to set such vq->use_wrap_counter flag a new API is exported virtqueue_use_wrap_counter(). (given such flag is not really something related to the specification inherently, but it is more of an implementation issue...and sincerely I did not know where else to configure it..) The other patches in the series are related to tools/virtio fixes both related to this series and to pre-existing build errors. A new --wrap-counters option has been added to virtio_test to be able to measure perfs with and without the flag set. While testing on a QEMU-KVM x86 guest I saw a broad range of not-so-stable perf results (to the point that in some run the patched code seemed a lot [1/2 secs] faster than the original o_O, or with the same unpatched original code reporting widely different result in two subsequent runs): here down below you can find the most reliable perf data I was able to gather takeing care to: - pinning QEMU to well defined CPUs - forcing such Host CPUs to use 'performance' governor - pinning guest virtio_test processes to some vCPUs with cpuset (or taskset) - disabling ASLR - disabling console printout ...anyway I still saw a bit of contrasting results here and there, so any suggestion for a better VirtIO perf testbed setup is welcome. Series is based on Linux 5.17-rc2. Thanks, Cristian ---- cset shield --exec -- perf stat --repeat 25 -- nice -n -20 /root/virtio_test_unpatched Performance counter stats for 'nice -n -20 /root/virtio_test_unpatched' (25 runs): 6100.81 msec task-clock # 1.002 CPUs utilized ( +- 0.16% ) 19 context-switches # 3.126 /sec ( +- 2.08% ) 0 cpu-migrations # 0.000 /sec 134 page-faults # 22.049 /sec ( +- 0.03% ) 18249525657 cycles # 3.003 GHz ( +- 0.07% ) 45583397473 instructions # 2.52 insn per cycle ( +- 0.09% ) 14009712668 branches # 2.305 G/sec ( +- 0.09% ) 10075872 branch-misses # 0.07% of all branches ( +- 0.83% ) 6.0908 +- 0.0107 seconds time elapsed ( +- 0.18% ) cset shield --exec -- perf stat --repeat 25 -- nice -n -20 /root/virtio_test_flag_V1 Performance counter stats for 'nice -n -20 /root/virtio_test_flag_V1' (25 runs): 6182.21 msec task-clock # 1.007 CPUs utilized ( +- 0.25% ) 19 context-switches # 3.104 /sec ( +- 1.68% ) 0 cpu-migrations # 0.000 /sec 134 page-faults # 21.889 /sec ( +- 0.03% ) 18142274957 cycles # 2.963 GHz ( +- 0.13% ) 48973010013 instructions # 2.71 insn per cycle ( +- 0.18% ) 15064825126 branches # 2.461 G/sec ( +- 0.18% ) 8697800 branch-misses # 0.06% of all branches ( +- 0.89% ) 6.1382 +- 0.0172 seconds time elapsed ( +- 0.28% ) cset shield --exec -- perf stat --repeat 25 -- nice -n -20 /root/virtio_test_flag_V1 --wrap-counters Performance counter stats for 'nice -n -20 /root/virtio_test_flag_V1 --wrap-counters' (25 runs): 6051.58 msec task-clock # 0.984 CPUs utilized ( +- 0.22% ) 21 context-switches # 3.424 /sec ( +- 1.25% ) 0 cpu-migrations # 0.000 /sec 134 page-faults # 21.846 /sec ( +- 0.03% ) 17928356478 cycles # 2.923 GHz ( +- 0.11% ) 48147192304 instructions # 2.67 insn per cycle ( +- 0.14% ) 14808798588 branches # 2.414 G/sec ( +- 0.15% ) 9108899 branch-misses # 0.06% of all branches ( +- 1.22% ) 6.1525 +- 0.0155 seconds time elapsed ( +- 0.25% ) Cristian Marussi (4): tools/virtio: Miscellaneous build fixes tools/virtio: Add missing spin_lock_init on virtio_test virtio_ring: Embed a wrap counter in opaque poll index value tools/virtio: Add virtqueue_use_wrap_counter support drivers/virtio/virtio_ring.c | 88 ++++++++- include/linux/virtio.h | 2 + tools/virtio/Makefile | 2 +- tools/virtio/linux/bitfield.h | 88 +++++++++ tools/virtio/linux/bits.h | 36 ++++ tools/virtio/linux/mm_types.h | 3 + tools/virtio/linux/module.h | 1 + tools/virtio/linux/moduleparam.h | 7 + tools/virtio/linux/uio.h | 299 ++++++++++++++++++++++++++++++- tools/virtio/linux/virtio.h | 96 ++++++++++ tools/virtio/virtio_test.c | 39 +++- tools/virtio/vringh_test.c | 3 + 12 files changed, 655 insertions(+), 9 deletions(-) create mode 100644 tools/virtio/linux/bitfield.h create mode 100644 tools/virtio/linux/bits.h create mode 100644 tools/virtio/linux/mm_types.h create mode 100644 tools/virtio/linux/moduleparam.h