From patchwork Fri Jan 31 11:24:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Claudio Imbrenda X-Patchwork-Id: 13955262 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2E6E31E2821; Fri, 31 Jan 2025 11:25:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.158.5 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738322723; cv=none; b=VNzSNDewf3Nq9urMdj4ZGIs5l7JPBYXnbAQj8bOg1g7qOhwNxcWt28Y3jZhZ46BCsTcGzEjlrT3oC2PJMdjRHknI33CiZRE1V4OuofELvUs8Nu15/nMDfacSERZ99GkMlaXxny9nMBG4FDy3XfPlIJ8UXEg6BSPAT5xSiAsTj18= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738322723; c=relaxed/simple; bh=bKJAUpLRphd+Hcr/qCpsEWZd0XYy+4eRXlJfCJP+pVc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=u23IIUc6Ucdygu2W451Bjkoeq0q/83F8kTWUmn3QUKUTUAIW2RV2wajWqKnwWqCfLFt3RnRhNqAb6g14dmoMu7+iSidCNupfmcWufFutMg6qBSOEy3ZFMRi7YApt1KLuLMWHRhb9xIYW+FIssCMLKstACaKXVq/Jzd0YSXjAyzA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; spf=pass smtp.mailfrom=linux.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=fjEnV9nP; arc=none smtp.client-ip=148.163.158.5 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="fjEnV9nP" Received: from pps.filterd (m0353725.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 50V2OTaQ021969; Fri, 31 Jan 2025 11:25:19 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=pp1; bh=BcnCOVqRzirCzflU/ EanKz6V9kgrfshsWJJNnPhsPrk=; b=fjEnV9nP3mmi+fUUoiv62clLC9BGzllbR zvB1nuSuAcdVnAYRC+DQhPd7fPmOPMsw5eHCgsDKz7Ciyvg1YrNviPODE/fIKG7l iG+rdS1eHJ2W3rCqqSBYGf6GUiG5cX28+vNXYWlcg/ArcQySQs7NaNm7JsXYMyMP 4wYf0ZjH89OfK5f1tBGm9dUUzQxerNefmsX9mrJpI2fHJXFM1WN9aYEIe57D1108 pP0JL7Q0LDoW7wX1cwlP0uvVjy6N9E7nCFNKvcj2Za2iIZx/SVUFpx8ZzQeNf7w5 qId0P7MkxYhKBknuzLHuAlB6ZRrN4jR2UL92fLVc+Gf9M+fNGAWXA== Received: from ppma12.dal12v.mail.ibm.com (dc.9e.1632.ip4.static.sl-reverse.com [50.22.158.220]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 44gnby9x00-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 31 Jan 2025 11:25:19 +0000 (GMT) Received: from pps.filterd (ppma12.dal12v.mail.ibm.com [127.0.0.1]) by ppma12.dal12v.mail.ibm.com (8.18.1.2/8.18.1.2) with ESMTP id 50V8BILi017237; Fri, 31 Jan 2025 11:25:18 GMT Received: from smtprelay05.fra02v.mail.ibm.com ([9.218.2.225]) by ppma12.dal12v.mail.ibm.com (PPS) with ESMTPS id 44gfayb9f2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 31 Jan 2025 11:25:18 +0000 Received: from smtpav01.fra02v.mail.ibm.com (smtpav01.fra02v.mail.ibm.com [10.20.54.100]) by smtprelay05.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 50VBPF8G55837160 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 31 Jan 2025 11:25:15 GMT Received: from smtpav01.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1598720139; Fri, 31 Jan 2025 11:25:15 +0000 (GMT) Received: from smtpav01.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C341920133; Fri, 31 Jan 2025 11:25:10 +0000 (GMT) Received: from p-imbrenda.ibmuc.com (unknown [9.171.25.38]) by smtpav01.fra02v.mail.ibm.com (Postfix) with ESMTP; Fri, 31 Jan 2025 11:25:10 +0000 (GMT) From: Claudio Imbrenda To: pbonzini@redhat.com Cc: kvm@vger.kernel.org, linux-s390@vger.kernel.org, frankja@linux.ibm.com, borntraeger@de.ibm.com, david@redhat.com Subject: [GIT PULL v2 01/20] KVM: s390: vsie: fix some corner-cases when grabbing vsie pages Date: Fri, 31 Jan 2025 12:24:51 +0100 Message-ID: <20250131112510.48531-2-imbrenda@linux.ibm.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250131112510.48531-1-imbrenda@linux.ibm.com> References: <20250131112510.48531-1-imbrenda@linux.ibm.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: NgBUnH8vimi8lBYlwNb2HhR2-JuMtZGS X-Proofpoint-GUID: NgBUnH8vimi8lBYlwNb2HhR2-JuMtZGS X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1057,Hydra:6.0.680,FMLib:17.12.68.34 definitions=2025-01-31_04,2025-01-31_01,2024-11-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 priorityscore=1501 lowpriorityscore=0 impostorscore=0 mlxscore=0 malwarescore=0 suspectscore=0 phishscore=0 mlxlogscore=999 clxscore=1015 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.19.0-2501170000 definitions=main-2501310083 From: David Hildenbrand We try to reuse the same vsie page when re-executing the vsie with a given SCB address. The result is that we use the same shadow SCB -- residing in the vsie page -- and can avoid flushing the TLB when re-running the vsie on a CPU. So, when we allocate a fresh vsie page, or when we reuse a vsie page for a different SCB address -- reusing the shadow SCB in different context -- we set ihcpu=0xffff to trigger the flush. However, after we looked up the SCB address in the radix tree, but before we grabbed the vsie page by raising the refcount to 2, someone could reuse the vsie page for a different SCB address, adjusting page->index and the radix tree. In that case, we would be reusing the vsie page with a wrong page->index. Another corner case is that we might set the SCB address for a vsie page, but fail the insertion into the radix tree. Whoever would reuse that page would remove the corresponding radix tree entry -- which might now be a valid entry pointing at another page, resulting in the wrong vsie page getting removed from the radix tree. Let's handle such races better, by validating that the SCB address of a vsie page didn't change after we grabbed it (not reuse for a different SCB; the alternative would be performing another tree lookup), and by setting the SCB address to invalid until the insertion in the tree succeeded (SCB addresses are aligned to 512, so ULONG_MAX is invalid). These scenarios are rare, the effects a bit unclear, and these issues were only found by code inspection. Let's CC stable to be safe. Fixes: a3508fbe9dc6 ("KVM: s390: vsie: initial support for nested virtualization") Cc: stable@vger.kernel.org Signed-off-by: David Hildenbrand Reviewed-by: Claudio Imbrenda Reviewed-by: Christoph Schlameuss Tested-by: Christoph Schlameuss Message-ID: <20250107154344.1003072-2-david@redhat.com> Signed-off-by: Claudio Imbrenda --- arch/s390/kvm/vsie.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c index a687695d8f68..513e608567cc 100644 --- a/arch/s390/kvm/vsie.c +++ b/arch/s390/kvm/vsie.c @@ -1362,8 +1362,14 @@ static struct vsie_page *get_vsie_page(struct kvm *kvm, unsigned long addr) page = radix_tree_lookup(&kvm->arch.vsie.addr_to_page, addr >> 9); rcu_read_unlock(); if (page) { - if (page_ref_inc_return(page) == 2) - return page_to_virt(page); + if (page_ref_inc_return(page) == 2) { + if (page->index == addr) + return page_to_virt(page); + /* + * We raced with someone reusing + putting this vsie + * page before we grabbed it. + */ + } page_ref_dec(page); } @@ -1393,15 +1399,20 @@ static struct vsie_page *get_vsie_page(struct kvm *kvm, unsigned long addr) kvm->arch.vsie.next++; kvm->arch.vsie.next %= nr_vcpus; } - radix_tree_delete(&kvm->arch.vsie.addr_to_page, page->index >> 9); + if (page->index != ULONG_MAX) + radix_tree_delete(&kvm->arch.vsie.addr_to_page, + page->index >> 9); } - page->index = addr; - /* double use of the same address */ + /* Mark it as invalid until it resides in the tree. */ + page->index = ULONG_MAX; + + /* Double use of the same address or allocation failure. */ if (radix_tree_insert(&kvm->arch.vsie.addr_to_page, addr >> 9, page)) { page_ref_dec(page); mutex_unlock(&kvm->arch.vsie.mutex); return NULL; } + page->index = addr; mutex_unlock(&kvm->arch.vsie.mutex); vsie_page = page_to_virt(page); @@ -1496,7 +1507,9 @@ void kvm_s390_vsie_destroy(struct kvm *kvm) vsie_page = page_to_virt(page); release_gmap_shadow(vsie_page); /* free the radix tree entry */ - radix_tree_delete(&kvm->arch.vsie.addr_to_page, page->index >> 9); + if (page->index != ULONG_MAX) + radix_tree_delete(&kvm->arch.vsie.addr_to_page, + page->index >> 9); __free_page(page); } kvm->arch.vsie.page_count = 0;