From patchwork Thu Apr 25 09:15:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cornelia Huck X-Patchwork-Id: 10916355 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5A914161F for ; Thu, 25 Apr 2019 09:16:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4BF4D28C2B for ; Thu, 25 Apr 2019 09:16:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 400E828C4D; Thu, 25 Apr 2019 09:16:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CA35128C42 for ; Thu, 25 Apr 2019 09:16:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727649AbfDYJQU (ORCPT ); Thu, 25 Apr 2019 05:16:20 -0400 Received: from mx1.redhat.com ([209.132.183.28]:36002 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727768AbfDYJQU (ORCPT ); Thu, 25 Apr 2019 05:16:20 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B78F13082E23; Thu, 25 Apr 2019 09:16:19 +0000 (UTC) Received: from localhost (dhcp-192-187.str.redhat.com [10.33.192.187]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 619565C28D; Thu, 25 Apr 2019 09:16:19 +0000 (UTC) From: Cornelia Huck To: Martin Schwidefsky , Heiko Carstens Cc: Farhan Ali , Eric Farman , Halil Pasic , linux-s390@vger.kernel.org, kvm@vger.kernel.org, Cornelia Huck Subject: [PULL 9/9] vfio-ccw: Prevent quiesce function going into an infinite loop Date: Thu, 25 Apr 2019 11:15:59 +0200 Message-Id: <20190425091559.15490-10-cohuck@redhat.com> In-Reply-To: <20190425091559.15490-1-cohuck@redhat.com> References: <20190425091559.15490-1-cohuck@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.46]); Thu, 25 Apr 2019 09:16:19 +0000 (UTC) Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Farhan Ali The quiesce function calls cio_cancel_halt_clear() and if we get an -EBUSY we go into a loop where we: - wait for any interrupts - flush all I/O in the workqueue - retry cio_cancel_halt_clear During the period where we are waiting for interrupts or flushing all I/O, the channel subsystem could have completed a halt/clear action and turned off the corresponding activity control bits in the subchannel status word. This means the next time we call cio_cancel_halt_clear(), we will again start by calling cancel subchannel and so we can be stuck between calling cancel and halt forever. Rather than calling cio_cancel_halt_clear() immediately after waiting, let's try to disable the subchannel. If we succeed in disabling the subchannel then we know nothing else can happen with the device. Suggested-by: Eric Farman Signed-off-by: Farhan Ali Message-Id: <4d5a4b98ab1b41ac6131b5c36de18b76c5d66898.1555449329.git.alifm@linux.ibm.com> Reviewed-by: Eric Farman Acked-by: Halil Pasic Signed-off-by: Cornelia Huck --- drivers/s390/cio/vfio_ccw_drv.c | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c index d72aa8c760c5..ee8767f5845a 100644 --- a/drivers/s390/cio/vfio_ccw_drv.c +++ b/drivers/s390/cio/vfio_ccw_drv.c @@ -43,26 +43,30 @@ int vfio_ccw_sch_quiesce(struct subchannel *sch) if (ret != -EBUSY) goto out_unlock; + iretry = 255; do { - iretry = 255; ret = cio_cancel_halt_clear(sch, &iretry); - while (ret == -EBUSY) { - /* - * Flush all I/O and wait for - * cancel/halt/clear completion. - */ - private->completion = &completion; - spin_unlock_irq(sch->lock); - wait_for_completion_timeout(&completion, 3*HZ); + if (ret == -EIO) { + pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n", + sch->schid.ssid, sch->schid.sch_no); + break; + } + + /* + * Flush all I/O and wait for + * cancel/halt/clear completion. + */ + private->completion = &completion; + spin_unlock_irq(sch->lock); - private->completion = NULL; - flush_workqueue(vfio_ccw_work_q); - spin_lock_irq(sch->lock); - ret = cio_cancel_halt_clear(sch, &iretry); - }; + if (ret == -EBUSY) + wait_for_completion_timeout(&completion, 3*HZ); + private->completion = NULL; + flush_workqueue(vfio_ccw_work_q); + spin_lock_irq(sch->lock); ret = cio_disable_subchannel(sch); } while (ret == -EBUSY); out_unlock: