From patchwork Thu May 20 14:13:03 2021
From: Stefan Hajnoczi
To: virtualization@lists.linux-foundation.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    Christoph Hellwig, Jason Wang, Paolo Bonzini, Jens Axboe,
    slp@redhat.com, sgarzare@redhat.com, "Michael S. Tsirkin",
    Stefan Hajnoczi
Subject: [PATCH 1/3] virtio: add virtqueue_more_used()
Date: Thu, 20 May 2021 15:13:03 +0100
Message-Id: <20210520141305.355961-2-stefanha@redhat.com>
In-Reply-To: <20210520141305.355961-1-stefanha@redhat.com>
References: <20210520141305.355961-1-stefanha@redhat.com>

Add an API to check whether there are pending used buffers. There is
already a similar API called virtqueue_poll(), but it only works
together with virtqueue_enable_cb_prepare().
The patches that follow add blk-mq ->poll() support to virtio_blk and
they need to check for used buffers without re-enabling virtqueue
callbacks, so introduce an API for it.

Signed-off-by: Stefan Hajnoczi
---
 include/linux/virtio.h       |  2 ++
 drivers/virtio/virtio_ring.c | 17 +++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index b1894e0323fa..c6ad0f25f412 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -63,6 +63,8 @@ bool virtqueue_kick_prepare(struct virtqueue *vq);

 bool virtqueue_notify(struct virtqueue *vq);

+bool virtqueue_more_used(const struct virtqueue *vq);
+
 void *virtqueue_get_buf(struct virtqueue *vq, unsigned int *len);

 void *virtqueue_get_buf_ctx(struct virtqueue *vq, unsigned int *len,

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 71e16b53e9c1..7c3da75da462 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2032,6 +2032,23 @@ static inline bool more_used(const struct vring_virtqueue *vq)
 	return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
 }

+/**
+ * virtqueue_more_used - check if there are used buffers pending
+ * @_vq: the struct virtqueue we're talking about.
+ *
+ * Returns true if there are used buffers, false otherwise. May be called at
+ * the same time as other virtqueue operations, but actually calling
+ * virtqueue_get_buf() requires serialization so be mindful of the race between
+ * calling virtqueue_more_used() and virtqueue_get_buf().
+ */
+bool virtqueue_more_used(const struct virtqueue *_vq)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+
+	return more_used(vq);
+}
+EXPORT_SYMBOL_GPL(virtqueue_more_used);
+
 irqreturn_t vring_interrupt(int irq, void *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
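Not part of the patch: a minimal sketch of how a driver's poll path
might use the new API. The my_dev_poll() function, struct my_req and
the my_complete_one() helper are hypothetical stand-ins. The initial
check may run concurrently with other virtqueue operations, but
virtqueue_get_buf() still needs serialization, hence the lock.

  /* Hypothetical poll path built on virtqueue_more_used(). */
  static int my_dev_poll(struct virtqueue *vq, spinlock_t *lock)
  {
  	struct my_req *r;
  	unsigned int len;
  	int found = 0;

  	/* Cheap, lock-free check for pending used buffers. */
  	if (!virtqueue_more_used(vq))
  		return 0;

  	/* Actually completing buffers requires serialization. */
  	spin_lock(lock);
  	while ((r = virtqueue_get_buf(vq, &len)) != NULL) {
  		my_complete_one(r, len);
  		found++;
  	}
  	spin_unlock(lock);

  	return found;
  }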
From patchwork Thu May 20 14:13:04 2021
From: Stefan Hajnoczi
To: virtualization@lists.linux-foundation.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    Christoph Hellwig, Jason Wang, Paolo Bonzini, Jens Axboe,
    slp@redhat.com, sgarzare@redhat.com, "Michael S. Tsirkin",
    Stefan Hajnoczi
Subject: [PATCH 2/3] virtio_blk: avoid repeating vblk->vqs[qid]
Date: Thu, 20 May 2021 15:13:04 +0100
Message-Id: <20210520141305.355961-3-stefanha@redhat.com>
In-Reply-To: <20210520141305.355961-1-stefanha@redhat.com>
References: <20210520141305.355961-1-stefanha@redhat.com>

struct virtio_blk_vq is accessed in many places. Introduce "vbq" local
variables to avoid repeating vblk->vqs[qid] throughout the code. The
patches that follow will add more accesses, making the payoff even
greater.

virtio_commit_rqs() names its local variable "vq", which is easily
confused with struct virtqueue. Rename it to "vbq" for clarity.

Signed-off-by: Stefan Hajnoczi
Acked-by: Jason Wang
---
 drivers/block/virtio_blk.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index b9fa3ef5b57c..fc0fb1dcd399 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -174,16 +174,16 @@ static inline void virtblk_request_done(struct request *req)
 static void virtblk_done(struct virtqueue *vq)
 {
 	struct virtio_blk *vblk = vq->vdev->priv;
+	struct virtio_blk_vq *vbq = &vblk->vqs[vq->index];
 	bool req_done = false;
-	int qid = vq->index;
 	struct virtblk_req *vbr;
 	unsigned long flags;
 	unsigned int len;

-	spin_lock_irqsave(&vblk->vqs[qid].lock, flags);
+	spin_lock_irqsave(&vbq->lock, flags);
 	do {
 		virtqueue_disable_cb(vq);
-		while ((vbr = virtqueue_get_buf(vblk->vqs[qid].vq, &len)) != NULL) {
+		while ((vbr = virtqueue_get_buf(vq, &len)) != NULL) {
 			struct request *req = blk_mq_rq_from_pdu(vbr);

 			if (likely(!blk_should_fake_timeout(req->q)))
@@ -197,32 +197,32 @@ static void virtblk_done(struct virtqueue *vq)
 	/* In case queue is stopped waiting for more buffers. */
 	if (req_done)
 		blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
-	spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
+	spin_unlock_irqrestore(&vbq->lock, flags);
 }

 static void virtio_commit_rqs(struct blk_mq_hw_ctx *hctx)
 {
 	struct virtio_blk *vblk = hctx->queue->queuedata;
-	struct virtio_blk_vq *vq = &vblk->vqs[hctx->queue_num];
+	struct virtio_blk_vq *vbq = &vblk->vqs[hctx->queue_num];
 	bool kick;

-	spin_lock_irq(&vq->lock);
-	kick = virtqueue_kick_prepare(vq->vq);
-	spin_unlock_irq(&vq->lock);
+	spin_lock_irq(&vbq->lock);
+	kick = virtqueue_kick_prepare(vbq->vq);
+	spin_unlock_irq(&vbq->lock);

 	if (kick)
-		virtqueue_notify(vq->vq);
+		virtqueue_notify(vbq->vq);
 }

 static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 			   const struct blk_mq_queue_data *bd)
 {
 	struct virtio_blk *vblk = hctx->queue->queuedata;
+	struct virtio_blk_vq *vbq = &vblk->vqs[hctx->queue_num];
 	struct request *req = bd->rq;
 	struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
 	unsigned long flags;
 	unsigned int num;
-	int qid = hctx->queue_num;
 	int err;
 	bool notify = false;
 	bool unmap = false;
@@ -274,16 +274,16 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 			vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_IN);
 	}

-	spin_lock_irqsave(&vblk->vqs[qid].lock, flags);
-	err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg, num);
+	spin_lock_irqsave(&vbq->lock, flags);
+	err = virtblk_add_req(vbq->vq, vbr, vbr->sg, num);
 	if (err) {
-		virtqueue_kick(vblk->vqs[qid].vq);
+		virtqueue_kick(vbq->vq);
 		/* Don't stop the queue if -ENOMEM: we may have failed to
 		 * bounce the buffer due to global resource outage.
 		 */
 		if (err == -ENOSPC)
 			blk_mq_stop_hw_queue(hctx);
-		spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
+		spin_unlock_irqrestore(&vbq->lock, flags);
 		switch (err) {
 		case -ENOSPC:
 			return BLK_STS_DEV_RESOURCE;
@@ -294,12 +294,12 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 		}
 	}

-	if (bd->last && virtqueue_kick_prepare(vblk->vqs[qid].vq))
+	if (bd->last && virtqueue_kick_prepare(vbq->vq))
 		notify = true;
-	spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
+	spin_unlock_irqrestore(&vbq->lock, flags);

 	if (notify)
-		virtqueue_notify(vblk->vqs[qid].vq);
+		virtqueue_notify(vbq->vq);
 	return BLK_STS_OK;
 }
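The pattern in miniature, with lines lifted from the hunks above
(illustrative only):

  /* Before: the array lookup repeats at every access. */
  spin_lock_irqsave(&vblk->vqs[qid].lock, flags);
  err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg, num);

  /* After: one lookup, one unambiguous name. */
  struct virtio_blk_vq *vbq = &vblk->vqs[hctx->queue_num];

  spin_lock_irqsave(&vbq->lock, flags);
  err = virtblk_add_req(vbq->vq, vbr, vbr->sg, num);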
From patchwork Thu May 20 14:13:05 2021
From: Stefan Hajnoczi
To: virtualization@lists.linux-foundation.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    Christoph Hellwig, Jason Wang, Paolo Bonzini, Jens Axboe,
    slp@redhat.com, sgarzare@redhat.com, "Michael S. Tsirkin",
    Stefan Hajnoczi
Subject: [PATCH 3/3] virtio_blk: implement blk_mq_ops->poll()
Date: Thu, 20 May 2021 15:13:05 +0100
Message-Id: <20210520141305.355961-4-stefanha@redhat.com>
In-Reply-To: <20210520141305.355961-1-stefanha@redhat.com>
References: <20210520141305.355961-1-stefanha@redhat.com>

Request completion latency can be reduced by using polling instead of
irqs. Even Posted Interrupts or similar hardware support doesn't beat
polling: disabling virtqueue notifications saves critical-path CPU
cycles on the host by skipping irq injection and in the guest by
skipping the irq handler. So let's add blk_mq_ops->poll() support to
virtio_blk.

The approach taken by this patch differs from the NVMe driver's. NVMe
dedicates hardware queues to polling and submits REQ_HIPRI requests
only on those queues. This patch does not require exclusive polling
queues for virtio_blk. Instead, it switches between irqs and polling
while one or more REQ_HIPRI requests are in flight on a virtqueue.

This is possible because toggling virtqueue notifications is cheap even
while the virtqueue is running. NVMe completion queues can't do this
because irqs are only enabled/disabled at queue creation time.

This toggling approach requires no configuration. There is no need to
dedicate queues ahead of time or to teach users and orchestration tools
how to set up polling queues.

Possible drawbacks of this approach:

- Hardware virtio_blk implementations may find virtqueue_disable_cb()
  expensive since it requires DMA. If such devices become popular then
  the virtio_blk driver could use a similar approach to NVMe when
  VIRTIO_F_ACCESS_PLATFORM is detected in the future.

- If a blk_poll() thread is descheduled it not only hurts polling
  performance but also delays completion of non-REQ_HIPRI requests on
  that virtqueue since vq notifications are disabled.
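To make the switching rule concrete, here is the submission-side
accounting in isolation. This is a sketch, not code from the patch: it
mirrors the logic added to virtio_queue_rq() below and assumes
vbq->lock is held by the caller.

  /* Called under vbq->lock when a request is submitted. */
  static void vbq_account_submit(struct virtio_blk_vq *vbq, bool hipri)
  {
  	if (hipri) {
  		/* First polled request in flight: stop notifications. */
  		if (vbq->num_hipri++ == 0 && vbq->cb_enabled) {
  			virtqueue_disable_cb(vbq->vq);
  			vbq->cb_enabled = false;
  		}
  	} else {
  		/* Non-polled request on an idle, quiesced vq: re-enable. */
  		if (vbq->num_lopri++ == 0 && vbq->num_hipri == 0 &&
  		    !vbq->cb_enabled) {
  			virtqueue_enable_cb(vbq->vq);
  			vbq->cb_enabled = true;
  		}
  	}
  }

The completion side does the inverse: when the last REQ_HIPRI request
finishes while non-polled requests remain, notifications are re-enabled
so the remaining requests still complete via irq.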
Performance:

- Benchmark: fio ioengine=pvsync2 numjobs=4 direct=1
- Guest: 4 vCPUs with one virtio-blk device (4 virtqueues)
- Disk: Intel Corporation NVMe Datacenter SSD [Optane] [8086:2701]
- CPU: Intel(R) Xeon(R) Silver 4214 CPU @ 2.20GHz

  rw         bs    hipri=0  hipri=1
  ---------------------------------------
  randread    4k   149,426  170,763  +14%
  randread   16k   118,939  134,269  +12%
  randread   64k    34,886   34,906    0%
  randread  128k    17,655   17,667    0%
  randwrite   4k   138,578  163,600  +18%
  randwrite  16k   102,089  120,950  +18%
  randwrite  64k    32,364   32,561    0%
  randwrite 128k    16,154   16,237    0%
  read        4k   146,032  170,620  +16%
  read       16k   117,097  130,437  +11%
  read       64k    34,834   35,037    0%
  read      128k    17,680   17,658    0%
  write       4k   134,562  151,422  +12%
  write      16k   101,796  107,606   +5%
  write      64k    32,364   32,594    0%
  write     128k    16,259   16,265    0%

Larger block sizes do not benefit from polling as much, but the
improvement is worthwhile for smaller block sizes.

Signed-off-by: Stefan Hajnoczi
---
 drivers/block/virtio_blk.c | 92 +++++++++++++++++++++++++++++++++++---
 1 file changed, 87 insertions(+), 5 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index fc0fb1dcd399..f0243dcd745a 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -29,6 +29,16 @@ static struct workqueue_struct *virtblk_wq;
 struct virtio_blk_vq {
 	struct virtqueue *vq;
 	spinlock_t lock;
+
+	/* Number of non-REQ_HIPRI requests in flight. Protected by lock. */
+	unsigned int num_lopri;
+
+	/* Number of REQ_HIPRI requests in flight. Protected by lock. */
+	unsigned int num_hipri;
+
+	/* Are vq notifications enabled? Protected by lock. */
+	bool cb_enabled;
+
 	char name[VQ_NAME_LEN];
 } ____cacheline_aligned_in_smp;

@@ -171,33 +181,67 @@ static inline void virtblk_request_done(struct request *req)
 	blk_mq_end_request(req, virtblk_result(vbr));
 }

-static void virtblk_done(struct virtqueue *vq)
+/* Returns true if one or more requests completed */
+static bool virtblk_complete_requests(struct virtqueue *vq)
 {
 	struct virtio_blk *vblk = vq->vdev->priv;
 	struct virtio_blk_vq *vbq = &vblk->vqs[vq->index];
 	bool req_done = false;
+	bool last_hipri_done = false;
 	struct virtblk_req *vbr;
 	unsigned long flags;
 	unsigned int len;

 	spin_lock_irqsave(&vbq->lock, flags);
+
 	do {
-		virtqueue_disable_cb(vq);
+		if (vbq->cb_enabled)
+			virtqueue_disable_cb(vq);
 		while ((vbr = virtqueue_get_buf(vq, &len)) != NULL) {
 			struct request *req = blk_mq_rq_from_pdu(vbr);

+			if (req->cmd_flags & REQ_HIPRI) {
+				if (--vbq->num_hipri == 0)
+					last_hipri_done = true;
+			} else
+				vbq->num_lopri--;
+
 			if (likely(!blk_should_fake_timeout(req->q)))
 				blk_mq_complete_request(req);
 			req_done = true;
 		}
 		if (unlikely(virtqueue_is_broken(vq)))
 			break;
-	} while (!virtqueue_enable_cb(vq));
+
+		/* Enable vq notifications if non-polled requests remain */
+		if (last_hipri_done && vbq->num_lopri > 0) {
+			last_hipri_done = false;
+			vbq->cb_enabled = true;
+		}
+	} while (vbq->cb_enabled && !virtqueue_enable_cb(vq));

 	/* In case queue is stopped waiting for more buffers. */
 	if (req_done)
 		blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
 	spin_unlock_irqrestore(&vbq->lock, flags);
+
+	return req_done;
+}
+
+static int virtblk_poll(struct blk_mq_hw_ctx *hctx)
+{
+	struct virtio_blk *vblk = hctx->queue->queuedata;
+	struct virtqueue *vq = vblk->vqs[hctx->queue_num].vq;
+
+	if (!virtqueue_more_used(vq))
+		return 0;
+
+	return virtblk_complete_requests(vq);
+}
+
+static void virtblk_done(struct virtqueue *vq)
+{
+	virtblk_complete_requests(vq);
 }

 static void virtio_commit_rqs(struct blk_mq_hw_ctx *hctx)
@@ -275,6 +319,16 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 	}

 	spin_lock_irqsave(&vbq->lock, flags);
+
+	/* Re-enable vq notifications if first req is non-polling */
+	if (!(req->cmd_flags & REQ_HIPRI) &&
+	    vbq->num_lopri == 0 && vbq->num_hipri == 0 &&
+	    !vbq->cb_enabled) {
+		/* Can't return false since there are no in-flight reqs */
+		virtqueue_enable_cb(vbq->vq);
+		vbq->cb_enabled = true;
+	}
+
 	err = virtblk_add_req(vbq->vq, vbr, vbr->sg, num);
 	if (err) {
 		virtqueue_kick(vbq->vq);
@@ -294,6 +348,21 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
 		}
 	}

+	/*
+	 * Disable vq notifications when polled reqs are submitted.
+	 *
+	 * The virtqueue lock is held so req is still valid here even if the
+	 * device polls the virtqueue and completes the request before we call
+	 * virtqueue_notify().
+	 */
+	if (req->cmd_flags & REQ_HIPRI) {
+		if (vbq->num_hipri++ == 0 && vbq->cb_enabled) {
+			virtqueue_disable_cb(vbq->vq);
+			vbq->cb_enabled = false;
+		}
+	} else
+		vbq->num_lopri++;
+
 	if (bd->last && virtqueue_kick_prepare(vbq->vq))
 		notify = true;
 	spin_unlock_irqrestore(&vbq->lock, flags);
@@ -533,6 +602,9 @@ static int init_vq(struct virtio_blk *vblk)
 	for (i = 0; i < num_vqs; i++) {
 		spin_lock_init(&vblk->vqs[i].lock);
 		vblk->vqs[i].vq = vqs[i];
+		vblk->vqs[i].num_lopri = 0;
+		vblk->vqs[i].num_hipri = 0;
+		vblk->vqs[i].cb_enabled = true;
 	}
 	vblk->num_vqs = num_vqs;

@@ -681,8 +753,16 @@ static int virtblk_map_queues(struct blk_mq_tag_set *set)
 {
 	struct virtio_blk *vblk = set->driver_data;

-	return blk_mq_virtio_map_queues(&set->map[HCTX_TYPE_DEFAULT],
-					vblk->vdev, 0);
+	set->map[HCTX_TYPE_DEFAULT].nr_queues = vblk->num_vqs;
+	blk_mq_virtio_map_queues(&set->map[HCTX_TYPE_DEFAULT], vblk->vdev, 0);
+
+	set->map[HCTX_TYPE_READ].nr_queues = 0;
+
+	/* HCTX_TYPE_DEFAULT queues are shared with HCTX_TYPE_POLL */
+	set->map[HCTX_TYPE_POLL].nr_queues = vblk->num_vqs;
+	blk_mq_virtio_map_queues(&set->map[HCTX_TYPE_POLL], vblk->vdev, 0);
+
+	return 0;
 }

 static const struct blk_mq_ops virtio_mq_ops = {
@@ -691,6 +771,7 @@ static const struct blk_mq_ops virtio_mq_ops = {
 	.complete	= virtblk_request_done,
 	.init_request	= virtblk_init_request,
 	.map_queues	= virtblk_map_queues,
+	.poll		= virtblk_poll,
 };

 static unsigned int virtblk_queue_depth;
@@ -768,6 +849,7 @@ static int virtblk_probe(struct virtio_device *vdev)
 	memset(&vblk->tag_set, 0, sizeof(vblk->tag_set));
 	vblk->tag_set.ops = &virtio_mq_ops;
+	vblk->tag_set.nr_maps = 3;	/* default, read, and poll */
 	vblk->tag_set.queue_depth = queue_depth;
 	vblk->tag_set.numa_node = NUMA_NO_NODE;
 	vblk->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
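For reference, an fio invocation consistent with the benchmark
parameters above; the job name, target device and runtime are
assumptions:

  # Polled run (hipri=1); drop --hipri for the hipri=0 baseline.
  fio --name=bench --ioengine=pvsync2 --numjobs=4 --direct=1 \
      --rw=randread --bs=4k --filename=/dev/vda \
      --time_based --runtime=30 --hipri

pvsync2's hipri option issues RWF_HIPRI reads/writes, which become
REQ_HIPRI requests and therefore exercise the new ->poll() path.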