From patchwork Sat Feb 3 04:21:12 2018
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 10198311
From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org, Christoph Hellwig, Mike Snitzer
Cc: linux-scsi@vger.kernel.org, Hannes Reinecke, Arun Easi, Omar Sandoval,
    "Martin K. Petersen", James Bottomley, Christoph Hellwig, Don Brace,
    Kashyap Desai, Peter Rivera, Paolo Bonzini, Laurence Oberman, Ming Lei
Subject: [PATCH 5/5] scsi: virtio_scsi: fix IO hang by irq vector automatic affinity
Date: Sat, 3 Feb 2018 12:21:12 +0800
Message-Id: <20180203042112.15239-6-ming.lei@redhat.com>
In-Reply-To: <20180203042112.15239-1-ming.lei@redhat.com>
References: <20180203042112.15239-1-ming.lei@redhat.com>
X-Mailing-List: linux-scsi@vger.kernel.org

Now that 84676c1f21e8ff5 ("genirq/affinity: assign vectors to all possible
CPUs") has been merged into v4.16-rc, it is easy for some irq vectors to end
up with only offline CPUs assigned, and this can't be avoided even if the
allocation is improved.

For example, on an 8-core VM with CPUs 4-7 not present/offline and 4
virtio-scsi queues, the assigned irq affinity can take the following shape:

	irq 36, cpu list 0-7
	irq 37, cpu list 0-7
	irq 38, cpu list 0-7
	irq 39, cpu list 0-1
	irq 40, cpu list 4,6
	irq 41, cpu list 2-3
	irq 42, cpu list 5,7

In the non-SCSI_MQ case this triggers an IO hang, because a request can be
sent to a virtqueue whose irq vector targets only offline CPUs, so its
completion is never serviced. Given that storage IO always follows the C/S
model, there is no such issue with SCSI_MQ (blk-mq), because no IO can be
submitted to a hw queue that has no online CPUs.
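As an aside, the affinity shape above can be read back from procfs. The
following is a minimal userspace sketch (illustrative only, not part of this
patch) that walks /proc/irq and prints each vector's smp_affinity_list in the
same "irq N, cpu list ..." form; for managed vectors the effective affinity
may of course be a subset of what is printed.

/*
 * Illustrative only: print the configured CPU affinity of every irq,
 * using the standard /proc/irq/<N>/smp_affinity_list interface.
 */
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	DIR *d = opendir("/proc/irq");
	struct dirent *e;
	char path[256], line[256];

	if (!d) {
		perror("opendir /proc/irq");
		return 1;
	}

	while ((e = readdir(d)) != NULL) {
		FILE *f;

		/* only numeric entries are irq numbers */
		if (!isdigit((unsigned char)e->d_name[0]))
			continue;

		snprintf(path, sizeof(path),
			 "/proc/irq/%s/smp_affinity_list", e->d_name);
		f = fopen(path, "r");
		if (!f)
			continue;
		if (fgets(line, sizeof(line), f)) {
			line[strcspn(line, "\n")] = '\0';
			printf("irq %s, cpu list %s\n", e->d_name, line);
		}
		fclose(f);
	}
	closedir(d);
	return 0;
}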
Fix this issue by forcing virtio-scsi to use blk-mq. BTW, I have been using
virtio-scsi (scsi_mq) for several years, and it has been quite stable, so this
shouldn't introduce extra risk.

Cc: Hannes Reinecke
Cc: Arun Easi
Cc: Omar Sandoval
Cc: "Martin K. Petersen"
Cc: James Bottomley
Cc: Christoph Hellwig
Cc: Don Brace
Cc: Kashyap Desai
Cc: Peter Rivera
Cc: Paolo Bonzini
Cc: Laurence Oberman
Cc: Mike Snitzer
Signed-off-by: Ming Lei
Reviewed-by: Hannes Reinecke
---
 drivers/scsi/virtio_scsi.c | 59 +++-------------------------------------------
 1 file changed, 3 insertions(+), 56 deletions(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 7c28e8d4955a..54e3a0f6844c 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -91,9 +91,6 @@ struct virtio_scsi_vq {
 struct virtio_scsi_target_state {
 	seqcount_t tgt_seq;
 
-	/* Count of outstanding requests. */
-	atomic_t reqs;
-
 	/* Currently active virtqueue for requests sent to this target. */
 	struct virtio_scsi_vq *req_vq;
 };
@@ -152,8 +149,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
 	struct virtio_scsi_cmd *cmd = buf;
 	struct scsi_cmnd *sc = cmd->sc;
 	struct virtio_scsi_cmd_resp *resp = &cmd->resp.cmd;
-	struct virtio_scsi_target_state *tgt =
-				scsi_target(sc->device)->hostdata;
 
 	dev_dbg(&sc->device->sdev_gendev,
 		"cmd %p response %u status %#02x sense_len %u\n",
@@ -210,8 +205,6 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
 	}
 
 	sc->scsi_done(sc);
-
-	atomic_dec(&tgt->reqs);
 }
 
 static void virtscsi_vq_done(struct virtio_scsi *vscsi,
@@ -580,10 +573,7 @@ static int virtscsi_queuecommand_single(struct Scsi_Host *sh,
 					struct scsi_cmnd *sc)
 {
 	struct virtio_scsi *vscsi = shost_priv(sh);
-	struct virtio_scsi_target_state *tgt =
-				scsi_target(sc->device)->hostdata;
 
-	atomic_inc(&tgt->reqs);
 	return virtscsi_queuecommand(vscsi, &vscsi->req_vqs[0], sc);
 }
 
@@ -596,55 +586,11 @@ static struct virtio_scsi_vq *virtscsi_pick_vq_mq(struct virtio_scsi *vscsi,
 	return &vscsi->req_vqs[hwq];
 }
 
-static struct virtio_scsi_vq *virtscsi_pick_vq(struct virtio_scsi *vscsi,
-					       struct virtio_scsi_target_state *tgt)
-{
-	struct virtio_scsi_vq *vq;
-	unsigned long flags;
-	u32 queue_num;
-
-	local_irq_save(flags);
-	if (atomic_inc_return(&tgt->reqs) > 1) {
-		unsigned long seq;
-
-		do {
-			seq = read_seqcount_begin(&tgt->tgt_seq);
-			vq = tgt->req_vq;
-		} while (read_seqcount_retry(&tgt->tgt_seq, seq));
-	} else {
-		/* no writes can be concurrent because of atomic_t */
-		write_seqcount_begin(&tgt->tgt_seq);
-
-		/* keep previous req_vq if a reader just arrived */
-		if (unlikely(atomic_read(&tgt->reqs) > 1)) {
-			vq = tgt->req_vq;
-			goto unlock;
-		}
-
-		queue_num = smp_processor_id();
-		while (unlikely(queue_num >= vscsi->num_queues))
-			queue_num -= vscsi->num_queues;
-		tgt->req_vq = vq = &vscsi->req_vqs[queue_num];
- unlock:
-		write_seqcount_end(&tgt->tgt_seq);
-	}
-	local_irq_restore(flags);
-
-	return vq;
-}
-
 static int virtscsi_queuecommand_multi(struct Scsi_Host *sh,
 				       struct scsi_cmnd *sc)
 {
 	struct virtio_scsi *vscsi = shost_priv(sh);
-	struct virtio_scsi_target_state *tgt =
-				scsi_target(sc->device)->hostdata;
-	struct virtio_scsi_vq *req_vq;
-
-	if (shost_use_blk_mq(sh))
-		req_vq = virtscsi_pick_vq_mq(vscsi, sc);
-	else
-		req_vq = virtscsi_pick_vq(vscsi, tgt);
+	struct virtio_scsi_vq *req_vq = virtscsi_pick_vq_mq(vscsi, sc);
 
 	return virtscsi_queuecommand(vscsi, req_vq, sc);
 }
@@ -775,7 +721,6 @@ static int virtscsi_target_alloc(struct scsi_target *starget)
 		return -ENOMEM;
 
 	seqcount_init(&tgt->tgt_seq);
-	atomic_set(&tgt->reqs, 0);
 	tgt->req_vq = &vscsi->req_vqs[0];
 
 	starget->hostdata = tgt;
@@ -823,6 +768,7 @@ static struct scsi_host_template virtscsi_host_template_single = {
 	.target_alloc = virtscsi_target_alloc,
 	.target_destroy = virtscsi_target_destroy,
 	.track_queue_depth = 1,
+	.force_blk_mq = 1,
 };
 
 static struct scsi_host_template virtscsi_host_template_multi = {
@@ -844,6 +790,7 @@ static struct scsi_host_template virtscsi_host_template_multi = {
 	.target_destroy = virtscsi_target_destroy,
 	.map_queues = virtscsi_map_queues,
 	.track_queue_depth = 1,
+	.force_blk_mq = 1,
 };
 
 #define virtscsi_config_get(vdev, fld) \