From patchwork Tue Apr 16 21:23:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Farhan Ali X-Patchwork-Id: 10904209 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0074817E0 for ; Tue, 16 Apr 2019 21:23:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E094B28950 for ; Tue, 16 Apr 2019 21:23:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D48412897C; Tue, 16 Apr 2019 21:23:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9937A28950 for ; Tue, 16 Apr 2019 21:23:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728489AbfDPVXX (ORCPT ); Tue, 16 Apr 2019 17:23:23 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:60796 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728277AbfDPVXX (ORCPT ); Tue, 16 Apr 2019 17:23:23 -0400 Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x3GLE2Oh109312 for ; Tue, 16 Apr 2019 17:23:21 -0400 Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.149]) by mx0b-001b2d01.pphosted.com with ESMTP id 2rwkntfaj2-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 16 Apr 2019 17:23:21 -0400 Received: from localhost by e31.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 16 Apr 2019 22:23:20 +0100 Received: from b03cxnp07029.gho.boulder.ibm.com (9.17.130.16) by e31.co.us.ibm.com (192.168.1.131) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 16 Apr 2019 22:23:17 +0100 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id x3GLNGjw20906140 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 16 Apr 2019 21:23:16 GMT Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3C579BE056; Tue, 16 Apr 2019 21:23:16 +0000 (GMT) Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 70233BE04F; Tue, 16 Apr 2019 21:23:15 +0000 (GMT) Received: from alifm-ThinkPad-T470p.pok.ibm.com (unknown [9.56.58.129]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTPS; Tue, 16 Apr 2019 21:23:15 +0000 (GMT) From: Farhan Ali To: kvm@vger.kernel.org, linux-s390@vger.kernel.org Cc: cohuck@redhat.com, farman@linux.ibm.com, pasic@linux.ibm.com, pmorel@linux.ibm.com, alifm@linux.ibm.com Subject: [PATCH v3 1/1] vfio-ccw: Prevent quiesce function going into an infinite loop Date: Tue, 16 Apr 2019 17:23:14 -0400 X-Mailer: git-send-email 2.7.4 In-Reply-To: References: X-TM-AS-GCONF: 00 x-cbid: 19041621-8235-0000-0000-00000E810DCE X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00010939; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000284; SDB=6.01190065; UDB=6.00623584; IPR=6.00970859; MB=3.00026473; MTD=3.00000008; XFM=3.00000015; UTC=2019-04-16 21:23:19 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 19041621-8236-0000-0000-0000452C4715 Message-Id: <4d5a4b98ab1b41ac6131b5c36de18b76c5d66898.1555449329.git.alifm@linux.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-04-16_08:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=782 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1904160129 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The quiesce function calls cio_cancel_halt_clear() and if we get an -EBUSY we go into a loop where we: - wait for any interrupts - flush all I/O in the workqueue - retry cio_cancel_halt_clear During the period where we are waiting for interrupts or flushing all I/O, the channel subsystem could have completed a halt/clear action and turned off the corresponding activity control bits in the subchannel status word. This means the next time we call cio_cancel_halt_clear(), we will again start by calling cancel subchannel and so we can be stuck between calling cancel and halt forever. Rather than calling cio_cancel_halt_clear() immediately after waiting, let's try to disable the subchannel. If we succeed in disabling the subchannel then we know nothing else can happen with the device. Suggested-by: Eric Farman Signed-off-by: Farhan Ali Acked-by: Eric Farman Reviewed-by: Eric Farman Acked-by: Halil Pasic --- ChangeLog: v2 -> v3 - Log an error message when cio_cancel_halt_clear returns EIO and break out of the loop. - Did not include past change log as the other patches of the original series have been queued by Conny. Old series (v2) can be found here: https://marc.info/?l=kvm&m=155475754101769&w=2 drivers/s390/cio/vfio_ccw_drv.c | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c index 78517aa..66a66ac 100644 --- a/drivers/s390/cio/vfio_ccw_drv.c +++ b/drivers/s390/cio/vfio_ccw_drv.c @@ -43,26 +43,30 @@ int vfio_ccw_sch_quiesce(struct subchannel *sch) if (ret != -EBUSY) goto out_unlock; + iretry = 255; do { - iretry = 255; ret = cio_cancel_halt_clear(sch, &iretry); - while (ret == -EBUSY) { - /* - * Flush all I/O and wait for - * cancel/halt/clear completion. - */ - private->completion = &completion; - spin_unlock_irq(sch->lock); - wait_for_completion_timeout(&completion, 3*HZ); + if (ret == -EIO) { + pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n", + sch->schid.ssid, sch->schid.sch_no); + break; + } + + /* + * Flush all I/O and wait for + * cancel/halt/clear completion. + */ + private->completion = &completion; + spin_unlock_irq(sch->lock); - private->completion = NULL; - flush_workqueue(vfio_ccw_work_q); - spin_lock_irq(sch->lock); - ret = cio_cancel_halt_clear(sch, &iretry); - }; + if (ret == -EBUSY) + wait_for_completion_timeout(&completion, 3*HZ); + private->completion = NULL; + flush_workqueue(vfio_ccw_work_q); + spin_lock_irq(sch->lock); ret = cio_disable_subchannel(sch); } while (ret == -EBUSY); out_unlock: