From patchwork Mon Sep 19 08:15:08 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Borntraeger
X-Patchwork-Id: 9338755
Subject: Re: [s390] possible deadlock in handle_sigp?
To: David Hildenbrand
References: <33773797-04ec-413f-7ba2-4bb7a4350a44@de.ibm.com> <20160915212142.5fd5048e@thinkpad-w530>
Cc: Paolo Bonzini, KVM list, Cornelia Huck, qemu-devel
From: Christian Borntraeger
Date: Mon, 19 Sep 2016 10:15:08 +0200
In-Reply-To: <20160915212142.5fd5048e@thinkpad-w530>
Message-Id: <18e8d5ed-c6f7-4617-0426-be203beb1965@de.ibm.com>
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

On 09/15/2016 09:21 PM, David Hildenbrand wrote:
>> On 09/12/2016 08:03 PM, Paolo Bonzini wrote:
>>>
>>>
>>> On 12/09/2016 19:37, Christian Borntraeger wrote:
>>>> On 09/12/2016 06:44 PM, Paolo Bonzini wrote:
>>>>> I think that two CPUs doing reciprocal SIGPs could in principle end up
>>>>> waiting on each other to complete their run_on_cpu. If the SIGP has to
>>>>> be synchronous the fix is not trivial (you'd have to put the CPU in a
>>>>> state similar to cpu->halted = 1), otherwise it's enough to replace
>>>>> run_on_cpu with async_run_on_cpu.
>>>>
>>>> IIRC the sigps are supposed to be serialized by the big QEMU lock. Will
>>>> have a look.
>>>
>>> Yes, but run_on_cpu drops it when it waits on the qemu_work_cond
>>> condition variable. (Related: I stumbled upon it because I wanted to
>>> remove the BQL from run_on_cpu work items).
>>
>> Yes, seems you are right. If both CPUs have just exited from KVM doing a
>> crossover sigp, they will do the arch_exit handling before the run_on_cpu
>> stuff which might result in this hang. (luckily it seems quite unlikely
>> but still we need to fix it).
>> We cannot simply use async as the callbacks also provide the condition
>> code for the initiator, so this requires some rework.
>>
>>
>
> Smells like having to provide a lock per CPU. Trylock that lock, if that's not
> possible, cc=busy. SIGP SET ARCHITECTURE has to lock all CPUs.
>
> That was the initial design, until I realized that this was all protected by
> the BQL.
>
> David

We only do the slow path things in QEMU. Maybe we could just have one lock that
we trylock and return a condition code of 2 (busy) if we fail. That seems the
simplest solution while still being architecturally correct.

Something like

diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index f348745..5706218 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -133,6 +133,8 @@ const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
     KVM_CAP_LAST_INFO
 };
 
+static QemuMutex qemu_sigp_mutex;
+
 static int cap_sync_regs;
 static int cap_async_pf;
 static int cap_mem_op;
@@ -358,6 +360,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         rc = compat_disable_facilities(s, fac_mask, ARRAY_SIZE(fac_mask));
     }
 
+    qemu_mutex_init(&qemu_sigp_mutex);
+
     return rc;
 }
 
@@ -1845,6 +1849,11 @@ static int handle_sigp(S390CPU *cpu, struct kvm_run *run, uint8_t ipa1)
     status_reg = &env->regs[r1];
     param = (r1 % 2) ? env->regs[r1] : env->regs[r1 + 1];
 
+    if (qemu_mutex_trylock(&qemu_sigp_mutex)) {
+        setcc(cpu, SIGP_CC_BUSY );
+        return 0;
+    }
+
     switch (order) {
     case SIGP_SET_ARCH:
         ret = sigp_set_architecture(cpu, param, status_reg);
@@ -1854,6 +1863,7 @@ static int handle_sigp(S390CPU *cpu, struct kvm_run *run, uint8_t ipa1)
         dst_cpu = s390_cpu_addr2state(env->regs[r3]);
         ret = handle_sigp_single_dst(dst_cpu, order, param, status_reg);
     }
+    qemu_mutex_unlock(&qemu_sigp_mutex);
 
     trace_kvm_sigp_finished(order, CPU(cpu)->cpu_index,
                             dst_cpu ? CPU(dst_cpu)->cpu_index : -1, ret);
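
For illustration only, the trylock-or-busy idea from the patch above can be tried outside of QEMU. The following is a minimal sketch using plain pthreads instead of QemuMutex; every name in it (demo_handle_sigp, slow_order, quick_order, run_demo, the DEMO_CC_* values) is hypothetical and not part of QEMU. One thread plays a CPU whose order is still in flight, and a second SIGP arriving in the meantime gets cc 2 (busy) instead of blocking; once the first order completes, a retry is accepted.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* SIGP condition codes relevant here: 0 = order accepted, 2 = busy. */
enum { DEMO_CC_ACCEPTED = 0, DEMO_CC_BUSY = 2 };

/* Single global lock, playing the role of qemu_sigp_mutex in the patch. */
static pthread_mutex_t demo_sigp_mutex = PTHREAD_MUTEX_INITIALIZER;

static sem_t order_started;    /* posted once an order holds the mutex */
static sem_t order_may_finish; /* posted to let that order complete */

/* Hypothetical order that stays in flight until it is told to finish. */
static int slow_order(void)
{
    sem_post(&order_started);
    sem_wait(&order_may_finish);
    return DEMO_CC_ACCEPTED;
}

/* Hypothetical order that completes immediately. */
static int quick_order(void)
{
    return DEMO_CC_ACCEPTED;
}

/* Stand-in for handle_sigp(): it never blocks on the mutex. */
static int demo_handle_sigp(int (*order)(void))
{
    int cc, rc;

    if (pthread_mutex_trylock(&demo_sigp_mutex) != 0) {
        return DEMO_CC_BUSY;   /* another SIGP is being processed */
    }
    cc = order();
    rc = pthread_mutex_unlock(&demo_sigp_mutex);
    assert(rc == 0);
    (void)rc;
    return cc;
}

/* "CPU 0": issues the slow order and stores its condition code. */
static void *cpu0_thread(void *opaque)
{
    *(int *)opaque = demo_handle_sigp(slow_order);
    return NULL;
}

/*
 * Runs the scenario from the point of view of "CPU 1". Stores CPU 1's
 * condition codes for the colliding attempt and for the retry, and
 * returns CPU 0's condition code (or -1 if the thread cannot be created).
 */
int run_demo(int *first_cc, int *retry_cc)
{
    pthread_t cpu0;
    int cpu0_cc = -1;

    sem_init(&order_started, 0, 0);
    sem_init(&order_may_finish, 0, 0);
    if (pthread_create(&cpu0, NULL, cpu0_thread, &cpu0_cc) != 0) {
        return -1;
    }

    sem_wait(&order_started);                   /* CPU 0 holds the mutex */
    *first_cc = demo_handle_sigp(quick_order);  /* collides with CPU 0 */

    sem_post(&order_may_finish);
    pthread_join(cpu0, NULL);                   /* CPU 0 has unlocked */
    *retry_cc = demo_handle_sigp(quick_order);  /* mutex is free again */

    sem_destroy(&order_started);
    sem_destroy(&order_may_finish);
    return cpu0_cc;
}
```

Because neither side ever sleeps while waiting for the lock, two CPUs sending each other SIGPs cannot end up waiting on each other; the price is that the issuer sees cc 2 and is expected to retry the order.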