From patchwork Wed Sep 19 08:47:49 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605497
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com,
    schwidefsky@de.ibm.com
Subject: [RFC 01/14] s390/mm: Code cleanups
Date: Wed, 19 Sep 2018 10:47:49 +0200
X-Mailer: git-send-email 2.14.3
Message-Id: <20180919084802.183381-2-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

Let's clean up leftovers before introducing new code.

Signed-off-by: Janosch Frank
Reviewed-by: David Hildenbrand
Acked-by: Christian Borntraeger
---
 arch/s390/mm/gmap.c    | 8 ++++----
 arch/s390/mm/pgtable.c | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 1e668b95e0c6..9ccd62cc7f37 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -5,7 +5,7 @@
  * Copyright IBM Corp. 2007, 2016, 2018
  *    Author(s): Martin Schwidefsky
  *               David Hildenbrand
- *               Janosch Frank
+ *               Janosch Frank
  */

 #include
@@ -2285,10 +2285,10 @@ static void gmap_pmdp_xchg(struct gmap *gmap, pmd_t *pmdp, pmd_t new,
 	pmdp_notify_gmap(gmap, pmdp, gaddr);
 	pmd_val(new) &= ~_SEGMENT_ENTRY_GMAP_IN;
 	if (MACHINE_HAS_TLB_GUEST)
-		__pmdp_idte(gaddr, (pmd_t *)pmdp, IDTE_GUEST_ASCE, gmap->asce,
+		__pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE, gmap->asce,
 			    IDTE_GLOBAL);
 	else if (MACHINE_HAS_IDTE)
-		__pmdp_idte(gaddr, (pmd_t *)pmdp, 0, 0, IDTE_GLOBAL);
+		__pmdp_idte(gaddr, pmdp, 0, 0, IDTE_GLOBAL);
 	else
 		__pmdp_csp(pmdp);
 	*pmdp = new;
@@ -2505,7 +2505,7 @@ static inline void thp_split_mm(struct mm_struct *mm)
  * - This must be called after THP was enabled
  */
 static int __zap_zero_pages(pmd_t *pmd, unsigned long start,
-			    unsigned long end, struct mm_walk *walk)
+			    unsigned long end, struct mm_walk *walk)
 {
 	unsigned long addr;

diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index f2cc7da473e4..16d35b881a11 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -723,7 +723,7 @@ void ptep_zap_key(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
  * Test and reset if a guest page is dirty
  */
 bool ptep_test_and_clear_uc(struct mm_struct *mm, unsigned long addr,
-			    pte_t *ptep)
+			    pte_t *ptep)
 {
 	pgste_t pgste;
 	pte_t pte;

From patchwork Wed Sep 19 08:47:50 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605499
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com,
    schwidefsky@de.ibm.com
Subject: [RFC 02/14] s390/mm: Improve locking for huge page backings
Date: Wed, 19 Sep 2018 10:47:50 +0200
X-Mailer: git-send-email 2.14.3
Message-Id: <20180919084802.183381-3-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

The gmap guest_table_lock is used to protect changes to the guest's
DAT tables from region 1 down to the segment level. It therefore also
protects the host-to-guest radix tree, in which each new segment
mapping established by gmap_link() is tracked. Changes to ptes are
synchronized through the pte lock, which is easily retrievable,
because the gmap shares its page tables with userspace.

With huge pages the story changes: pmd tables are not shared, so we
are left with the pmd lock on the userspace side and the
guest_table_lock on the gmap side. Having two locks for one object
invites locking problems.

Therefore the guest_table_lock will only be used for the population of
the gmap tables, and hence for protecting the host_to_guest tree,
while the pmd lock will be used for all changes to the pmd, from both
userspace and the gmap.

This means we need to retrieve the vmaddr to look up a gmap pmd, which
takes a bit longer than before. In exchange we can now operate on
multiple pmds in disjoint segment tables instead of serializing on one
global lock.
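In short, the lookup-and-lock sequence now looks like this (a condensed
sketch of the gmap_pmd_op_walk() rework in the diff below;
gmap_lock_hpmd() is a hypothetical name, and the huge-page and
pmd_none() checks are omitted):

	/* Sketch: take the userspace pmd lock for a guest address. */
	static spinlock_t *gmap_lock_hpmd(struct gmap *gmap,
					  unsigned long gaddr)
	{
		/* The pmd lock lives in the userspace page tables, so
		 * translate the guest address to a host vmaddr first. */
		unsigned long vmaddr = __gmap_translate(gmap, gaddr);
		pmd_t *hpmdp;

		if (IS_ERR_VALUE(vmaddr))
			return NULL;
		hpmdp = pmd_alloc_map(gmap->mm, vmaddr);
		if (!hpmdp)
			return NULL;
		/* Serializes userspace and gmap changes to this pmd. */
		return pmd_lock(gmap->mm, hpmdp);
	}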
Signed-off-by: Janosch Frank --- arch/s390/include/asm/pgtable.h | 1 + arch/s390/mm/gmap.c | 70 +++++++++++++++++++++++++---------------- arch/s390/mm/pgtable.c | 2 +- 3 files changed, 45 insertions(+), 28 deletions(-) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h index 0e7cb0dc9c33..c0abd57c5a21 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -1420,6 +1420,7 @@ static inline void __pudp_idte(unsigned long addr, pud_t *pudp, } } +pmd_t *pmd_alloc_map(struct mm_struct *mm, unsigned long addr); pmd_t pmdp_xchg_direct(struct mm_struct *, unsigned long, pmd_t *, pmd_t); pmd_t pmdp_xchg_lazy(struct mm_struct *, unsigned long, pmd_t *, pmd_t); pud_t pudp_xchg_direct(struct mm_struct *, unsigned long, pud_t *, pud_t); diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index 9ccd62cc7f37..04c24a284113 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -895,47 +895,62 @@ static void gmap_pte_op_end(spinlock_t *ptl) } /** - * gmap_pmd_op_walk - walk the gmap tables, get the guest table lock - * and return the pmd pointer + * gmap_pmd_op_walk - walk the gmap tables, get the pmd_lock if needed + * and return the pmd pointer or NULL * @gmap: pointer to guest mapping meta data structure * @gaddr: virtual address in the guest address space * * Returns a pointer to the pmd for a guest address, or NULL */ -static inline pmd_t *gmap_pmd_op_walk(struct gmap *gmap, unsigned long gaddr) +static inline pmd_t *gmap_pmd_op_walk(struct gmap *gmap, unsigned long gaddr, + spinlock_t **ptl) { - pmd_t *pmdp; + pmd_t *pmdp, *hpmdp; + unsigned long vmaddr; + BUG_ON(gmap_is_shadow(gmap)); - pmdp = (pmd_t *) gmap_table_walk(gmap, gaddr, 1); - if (!pmdp) - return NULL; - /* without huge pages, there is no need to take the table lock */ - if (!gmap->mm->context.allow_gmap_hpage_1m) - return pmd_none(*pmdp) ? NULL : pmdp; - - spin_lock(&gmap->guest_table_lock); - if (pmd_none(*pmdp)) { - spin_unlock(&gmap->guest_table_lock); - return NULL; + *ptl = NULL; + if (gmap->mm->context.allow_gmap_hpage_1m) { + vmaddr = __gmap_translate(gmap, gaddr); + if (IS_ERR_VALUE(vmaddr)) + return NULL; + hpmdp = pmd_alloc_map(gmap->mm, vmaddr); + if (!hpmdp) + return NULL; + *ptl = pmd_lock(gmap->mm, hpmdp); + if (pmd_none(*hpmdp)) { + spin_unlock(*ptl); + *ptl = NULL; + return NULL; + } + if (!pmd_large(*hpmdp)) { + spin_unlock(*ptl); + *ptl = NULL; + } + } + + pmdp = (pmd_t *) gmap_table_walk(gmap, gaddr, 1); + if (!pmdp || pmd_none(*pmdp)) { + if (*ptl) + spin_unlock(*ptl); + pmdp = NULL; + *ptl = NULL; } - /* 4k page table entries are locked via the pte (pte_alloc_map_lock). 
*/
-	if (!pmd_large(*pmdp))
-		spin_unlock(&gmap->guest_table_lock);
 	return pmdp;
 }

 /**
- * gmap_pmd_op_end - release the guest_table_lock if needed
+ * gmap_pmd_op_end - release the pmd lock if needed
  * @gmap: pointer to the guest mapping meta data structure
  * @pmdp: pointer to the pmd
  */
-static inline void gmap_pmd_op_end(struct gmap *gmap, pmd_t *pmdp)
+static inline void gmap_pmd_op_end(spinlock_t *ptl)
 {
-	if (pmd_large(*pmdp))
-		spin_unlock(&gmap->guest_table_lock);
+	if (ptl)
+		spin_unlock(ptl);
 }

 /*
@@ -1037,13 +1052,14 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr,
 			      unsigned long len, int prot, unsigned long bits)
 {
 	unsigned long vmaddr, dist;
+	spinlock_t *ptl = NULL;
 	pmd_t *pmdp;
 	int rc;

 	BUG_ON(gmap_is_shadow(gmap));
 	while (len) {
 		rc = -EAGAIN;
-		pmdp = gmap_pmd_op_walk(gmap, gaddr);
+		pmdp = gmap_pmd_op_walk(gmap, gaddr, &ptl);
 		if (pmdp) {
 			if (!pmd_large(*pmdp)) {
 				rc = gmap_protect_pte(gmap, gaddr, pmdp, prot,
@@ -1061,7 +1077,7 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr,
 				gaddr = (gaddr & HPAGE_MASK) + HPAGE_SIZE;
 			}
 		}
-		gmap_pmd_op_end(gmap, pmdp);
+		gmap_pmd_op_end(ptl);
 	}
 	if (rc) {
 		if (rc == -EINVAL)
@@ -2457,9 +2473,9 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long bitmap[4],
 	int i;
 	pmd_t *pmdp;
 	pte_t *ptep;
-	spinlock_t *ptl;
+	spinlock_t *ptl = NULL;

-	pmdp = gmap_pmd_op_walk(gmap, gaddr);
+	pmdp = gmap_pmd_op_walk(gmap, gaddr, &ptl);
 	if (!pmdp)
 		return;
@@ -2476,7 +2492,7 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long bitmap[4],
 			spin_unlock(ptl);
 		}
 	}
-	gmap_pmd_op_end(gmap, pmdp);
+	gmap_pmd_op_end(ptl);
 }
 EXPORT_SYMBOL_GPL(gmap_sync_dirty_log_pmd);

diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 16d35b881a11..4b184744350b 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -410,7 +410,7 @@ static inline pmd_t pmdp_flush_lazy(struct mm_struct *mm,
 	return old;
 }

-static pmd_t *pmd_alloc_map(struct mm_struct *mm, unsigned long addr)
+pmd_t *pmd_alloc_map(struct mm_struct *mm, unsigned long addr)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;

From patchwork Wed Sep 19 08:47:51 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605501
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com,
    schwidefsky@de.ibm.com
Subject: [RFC 03/14] s390/mm: Take locking out of gmap_protect_pte
Date: Wed, 19 Sep 2018 10:47:51 +0200
X-Mailer: git-send-email 2.14.3
Message-Id: <20180919084802.183381-4-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

Taking the pte lock outside of gmap_protect_pte() gives the caller the
freedom to order its locks, which will be important to avoid locking
issues in gmap_protect_rmap.
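The resulting calling convention is sketched here (condensed from the
gmap_protect_range() hunk below; error handling trimmed):

	spinlock_t *ptl_pte = NULL;
	pte_t *ptep;
	int rc = -ENOMEM;

	/* The caller maps and locks the pte itself... */
	ptep = pte_alloc_map_lock(gmap->mm, pmdp, gaddr, &ptl_pte);
	if (ptep)
		/* ...so gmap_protect_pte() no longer takes any lock... */
		rc = gmap_protect_pte(gmap, gaddr, ptep, prot, bits);
	/* ...and decides itself when to drop the lock again. */
	gmap_pte_op_end(ptl_pte);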
Signed-off-by: Janosch Frank
---
 arch/s390/mm/gmap.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 04c24a284113..795f558c8246 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -1013,25 +1013,15 @@ static int gmap_protect_pmd(struct gmap *gmap, unsigned long gaddr,
  * Expected to be called with sg->mm->mmap_sem in read
  */
 static int gmap_protect_pte(struct gmap *gmap, unsigned long gaddr,
-			    pmd_t *pmdp, int prot, unsigned long bits)
+			    pte_t *ptep, int prot, unsigned long bits)
 {
 	int rc;
-	pte_t *ptep;
-	spinlock_t *ptl = NULL;
 	unsigned long pbits = 0;

-	if (pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)
-		return -EAGAIN;
-
-	ptep = pte_alloc_map_lock(gmap->mm, pmdp, gaddr, &ptl);
-	if (!ptep)
-		return -ENOMEM;
-
 	pbits |= (bits & GMAP_NOTIFY_MPROT) ? PGSTE_IN_BIT : 0;
 	pbits |= (bits & GMAP_NOTIFY_SHADOW) ? PGSTE_VSIE_BIT : 0;
 	/* Protect and unlock. */
 	rc = ptep_force_prot(gmap->mm, gaddr, ptep, prot, pbits);
-	gmap_pte_op_end(ptl);
 	return rc;
 }

@@ -1052,18 +1042,26 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr,
 			      unsigned long len, int prot, unsigned long bits)
 {
 	unsigned long vmaddr, dist;
-	spinlock_t *ptl = NULL;
+	spinlock_t *ptl_pmd = NULL, *ptl_pte = NULL;
 	pmd_t *pmdp;
+	pte_t *ptep;
 	int rc;

 	BUG_ON(gmap_is_shadow(gmap));
 	while (len) {
 		rc = -EAGAIN;
-		pmdp = gmap_pmd_op_walk(gmap, gaddr, &ptl);
+		pmdp = gmap_pmd_op_walk(gmap, gaddr, &ptl_pmd);
 		if (pmdp) {
 			if (!pmd_large(*pmdp)) {
-				rc = gmap_protect_pte(gmap, gaddr, pmdp, prot,
-						      bits);
+				ptl_pte = NULL;
+				ptep = pte_alloc_map_lock(gmap->mm, pmdp, gaddr,
+							  &ptl_pte);
+				if (ptep)
+					rc = gmap_protect_pte(gmap, gaddr,
+							      ptep, prot, bits);
+				else
+					rc = -ENOMEM;
+				gmap_pte_op_end(ptl_pte);
 				if (!rc) {
 					len -= PAGE_SIZE;
 					gaddr += PAGE_SIZE;
@@ -1077,7 +1075,7 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr,
 				gaddr = (gaddr & HPAGE_MASK) + HPAGE_SIZE;
 			}
 		}
-		gmap_pmd_op_end(ptl);
+		gmap_pmd_op_end(ptl_pmd);
 	}
 	if (rc) {
 		if (rc == -EINVAL)

From patchwork Wed Sep 19 08:47:52 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605515
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com,
    schwidefsky@de.ibm.com
Subject: [RFC 04/14] s390/mm: split huge pages in GMAP when protecting
Date: Wed, 19 Sep 2018 10:47:52 +0200
X-Mailer: git-send-email 2.14.3
Message-Id: <20180919084802.183381-5-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

Dirty tracking, vsie protection and lowcore invalidation notification
are best done on the smallest page size available to avoid unnecessary
flushing and table management operations. Hence we now split huge pages
and introduce a page table if a notification bit is set or memory is
protected via gmap_protect_range or gmap_protect_rmap.
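The split itself is sketched below (condensed from gmap_pmd_split() in
this patch; the split-list tracking and the final gmap_pmdp_xchg() call
are left out). A pre-allocated PGSTE page table is filled so that its
256 ptes map the same memory as the huge pmd they replace:

	unsigned long *ptable = (unsigned long *) page_to_phys(page);
	int i;

	for (i = 0; i < 256; i++) {
		/* Each pte maps one 4k page of the 1M segment. */
		ptable[i] = (pmd_val(*pmdp) & HPAGE_MASK) + i * PAGE_SIZE;
		/* Carry over the hardware protection from the pmd. */
		if (pmd_val(*pmdp) & _SEGMENT_ENTRY_PROTECT)
			ptable[i] |= _PAGE_PROTECT;
		/* pmd_large() implies pmd/pte_present(). */
		ptable[i] |= _PAGE_PRESENT | _PAGE_READ | _PAGE_WRITE;
	}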
Signed-off-by: Janosch Frank --- arch/s390/include/asm/gmap.h | 18 +++ arch/s390/include/asm/pgtable.h | 3 + arch/s390/mm/gmap.c | 243 +++++++++++++++++++++++++++++++--------- arch/s390/mm/pgtable.c | 33 ++++++ 4 files changed, 247 insertions(+), 50 deletions(-) diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h index fcbd638fb9f4..c667bd0181d4 100644 --- a/arch/s390/include/asm/gmap.h +++ b/arch/s390/include/asm/gmap.h @@ -16,6 +16,11 @@ /* Status bits only for huge segment entries */ #define _SEGMENT_ENTRY_GMAP_IN 0x8000 /* invalidation notify bit */ #define _SEGMENT_ENTRY_GMAP_UC 0x4000 /* dirty (migration) */ +/* Status bits in the gmap segment entry. */ +#define _SEGMENT_ENTRY_GMAP_SPLIT 0x0001 /* split huge pmd */ + +#define GMAP_SEGMENT_STATUS_BITS (_SEGMENT_ENTRY_GMAP_UC | _SEGMENT_ENTRY_GMAP_SPLIT) +#define GMAP_SEGMENT_NOTIFY_BITS _SEGMENT_ENTRY_GMAP_IN /** * struct gmap_struct - guest address space @@ -56,6 +61,8 @@ struct gmap { struct radix_tree_root host_to_rmap; struct list_head children; struct list_head pt_list; + struct list_head split_list; + spinlock_t split_list_lock; spinlock_t shadow_lock; struct gmap *parent; unsigned long orig_asce; @@ -96,6 +103,17 @@ static inline int gmap_is_shadow(struct gmap *gmap) return !!gmap->parent; } +/** + * gmap_pmd_is_split - Returns if a huge gmap pmd has been split. + * @pmdp: pointer to the pmd + * + * Returns true if the passed huge gmap pmd has been split. + */ +static inline bool gmap_pmd_is_split(pmd_t *pmdp) +{ + return !!(pmd_val(*pmdp) & _SEGMENT_ENTRY_GMAP_SPLIT); +} + struct gmap *gmap_create(struct mm_struct *mm, unsigned long limit); void gmap_remove(struct gmap *gmap); struct gmap *gmap_get(struct gmap *gmap); diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h index c0abd57c5a21..54d8376b7a10 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -1103,6 +1103,9 @@ int ptep_shadow_pte(struct mm_struct *mm, unsigned long saddr, pte_t *sptep, pte_t *tptep, pte_t pte); void ptep_unshadow_pte(struct mm_struct *mm, unsigned long saddr, pte_t *ptep); +unsigned long ptep_get_and_clear_notification_bits(pte_t *ptep); +void ptep_remove_protection_split(struct mm_struct *mm, pte_t *ptep, + unsigned long gaddr); bool ptep_test_and_clear_uc(struct mm_struct *mm, unsigned long address, pte_t *ptep); int set_guest_storage_key(struct mm_struct *mm, unsigned long addr, diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index 795f558c8246..8e78a124d31a 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -62,11 +62,13 @@ static struct gmap *gmap_alloc(unsigned long limit) INIT_LIST_HEAD(&gmap->crst_list); INIT_LIST_HEAD(&gmap->children); INIT_LIST_HEAD(&gmap->pt_list); + INIT_LIST_HEAD(&gmap->split_list); INIT_RADIX_TREE(&gmap->guest_to_host, GFP_KERNEL); INIT_RADIX_TREE(&gmap->host_to_guest, GFP_ATOMIC); INIT_RADIX_TREE(&gmap->host_to_rmap, GFP_ATOMIC); spin_lock_init(&gmap->guest_table_lock); spin_lock_init(&gmap->shadow_lock); + spin_lock_init(&gmap->split_list_lock); atomic_set(&gmap->ref_count, 1); page = alloc_pages(GFP_KERNEL, CRST_ALLOC_ORDER); if (!page) @@ -193,6 +195,10 @@ static void gmap_free(struct gmap *gmap) gmap_radix_tree_free(&gmap->guest_to_host); gmap_radix_tree_free(&gmap->host_to_guest); + /* Free split pmd page tables */ + list_for_each_entry_safe(page, next, &gmap->split_list, lru) + page_table_free_pgste(page); + /* Free additional data for a shadow gmap */ if (gmap_is_shadow(gmap)) { /* Free all page tables. 
*/ @@ -547,6 +553,7 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr) pud_t *pud; pmd_t *pmd; u64 unprot; + pte_t *ptep; int rc; BUG_ON(gmap_is_shadow(gmap)); @@ -597,9 +604,15 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr) rc = radix_tree_preload(GFP_KERNEL); if (rc) return rc; + /* + * do_exception() does remove the pte index for huge + * pages, so we need to re-add it here to work on the + * correct pte. + */ + vmaddr = vmaddr | (gaddr & ~PMD_MASK); ptl = pmd_lock(mm, pmd); - spin_lock(&gmap->guest_table_lock); if (*table == _SEGMENT_ENTRY_EMPTY) { + spin_lock(&gmap->guest_table_lock); rc = radix_tree_insert(&gmap->host_to_guest, vmaddr >> PMD_SHIFT, table); if (!rc) { @@ -611,14 +624,24 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr) *table = pmd_val(*pmd) & _SEGMENT_ENTRY_HARDWARE_BITS; } + spin_unlock(&gmap->guest_table_lock); } else if (*table & _SEGMENT_ENTRY_PROTECT && !(pmd_val(*pmd) & _SEGMENT_ENTRY_PROTECT)) { unprot = (u64)*table; unprot &= ~_SEGMENT_ENTRY_PROTECT; unprot |= _SEGMENT_ENTRY_GMAP_UC; gmap_pmdp_xchg(gmap, (pmd_t *)table, __pmd(unprot), gaddr); + } else if (gmap_pmd_is_split((pmd_t *)table)) { + /* + * Split pmds are somewhere in-between a normal and a + * large pmd. As we don't share the page table, the + * host does not remove protection on a fault and we + * have to do it ourselves for the guest mapping. + */ + ptep = pte_offset_map((pmd_t *)table, vmaddr); + if (pte_val(*ptep) & _PAGE_PROTECT) + ptep_remove_protection_split(mm, ptep, vmaddr); } - spin_unlock(&gmap->guest_table_lock); spin_unlock(ptl); radix_tree_preload_end(); return rc; @@ -856,7 +879,7 @@ static pte_t *gmap_pte_op_walk(struct gmap *gmap, unsigned long gaddr, } /** - * gmap_pte_op_fixup - force a page in and connect the gmap page table + * gmap_fixup - force memory in and connect the gmap table entry * @gmap: pointer to guest mapping meta data structure * @gaddr: virtual address in the guest address space * @vmaddr: address in the host process address space @@ -864,10 +887,10 @@ static pte_t *gmap_pte_op_walk(struct gmap *gmap, unsigned long gaddr, * * Returns 0 if the caller can retry __gmap_translate (might fail again), * -ENOMEM if out of memory and -EFAULT if anything goes wrong while fixing - * up or connecting the gmap page table. + * up or connecting the gmap table entry. */ -static int gmap_pte_op_fixup(struct gmap *gmap, unsigned long gaddr, - unsigned long vmaddr, int prot) +static int gmap_fixup(struct gmap *gmap, unsigned long gaddr, + unsigned long vmaddr, int prot) { struct mm_struct *mm = gmap->mm; unsigned int fault_flags; @@ -953,6 +976,76 @@ static inline void gmap_pmd_op_end(spinlock_t *ptl) spin_unlock(ptl); } +static pte_t *gmap_pte_from_pmd(struct gmap *gmap, pmd_t *pmdp, + unsigned long addr, spinlock_t **ptl) +{ + *ptl = NULL; + if (likely(!gmap_pmd_is_split(pmdp))) + return pte_alloc_map_lock(gmap->mm, pmdp, addr, ptl); + + return pte_offset_map(pmdp, addr); +} + +/** + * gmap_pmd_split_free - Free a split pmd's page table + * @pmdp The split pmd that we free of its page table + * + * If the userspace pmds are exchanged, we'll remove the gmap pmds as + * well, so we fault on them and link them again. We would leak + * memory, if we didn't free split pmds here. 
+ */ +static inline void gmap_pmd_split_free(struct gmap *gmap, pmd_t *pmdp) +{ + unsigned long pgt = pmd_val(*pmdp) & _SEGMENT_ENTRY_ORIGIN; + struct page *page; + + if (gmap_pmd_is_split(pmdp)) { + page = pfn_to_page(pgt >> PAGE_SHIFT); + spin_lock(&gmap->split_list_lock); + list_del(&page->lru); + spin_unlock(&gmap->split_list_lock); + page_table_free_pgste(page); + } +} + +/** + * gmap_pmd_split - Split a huge gmap pmd and use a page table instead + * @gmap: pointer to guest mapping meta data structure + * @gaddr: virtual address in the guest address space + * @pmdp: pointer to the pmd that will be split + * @pgtable: Pre-allocated page table + * + * When splitting gmap pmds, we have to make the resulting page table + * look like it's a normal one to be able to use the common pte + * handling functions. Also we need to track these new tables as they + * aren't tracked anywhere else. + */ +static void gmap_pmd_split(struct gmap *gmap, unsigned long gaddr, + pmd_t *pmdp, struct page *page) +{ + unsigned long *ptable = (unsigned long *) page_to_phys(page); + pmd_t new; + int i; + + for (i = 0; i < 256; i++) { + ptable[i] = (pmd_val(*pmdp) & HPAGE_MASK) + i * PAGE_SIZE; + /* Carry over hardware permission from the pmd */ + if (pmd_val(*pmdp) & _SEGMENT_ENTRY_PROTECT) + ptable[i] |= _PAGE_PROTECT; + /* pmd_large() implies pmd/pte_present() */ + ptable[i] |= _PAGE_PRESENT | _PAGE_READ | _PAGE_WRITE; + /* ptes are directly marked as dirty */ + ptable[i + PTRS_PER_PTE] |= PGSTE_UC_BIT; + } + + pmd_val(new) = ((unsigned long)ptable | _SEGMENT_ENTRY | + (_SEGMENT_ENTRY_GMAP_SPLIT)); + spin_lock(&gmap->split_list_lock); + list_add(&page->lru, &gmap->split_list); + spin_unlock(&gmap->split_list_lock); + gmap_pmdp_xchg(gmap, pmdp, new, gaddr); +} + /* * gmap_protect_pmd - remove access rights to memory and set pmd notification bits * @pmdp: pointer to the pmd to be protected @@ -1041,7 +1134,8 @@ static int gmap_protect_pte(struct gmap *gmap, unsigned long gaddr, static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr, unsigned long len, int prot, unsigned long bits) { - unsigned long vmaddr, dist; + struct page *page = NULL; + unsigned long vmaddr; spinlock_t *ptl_pmd = NULL, *ptl_pte = NULL; pmd_t *pmdp; pte_t *ptep; @@ -1050,12 +1144,12 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr, BUG_ON(gmap_is_shadow(gmap)); while (len) { rc = -EAGAIN; + pmdp = gmap_pmd_op_walk(gmap, gaddr, &ptl_pmd); - if (pmdp) { + if (pmdp && !(pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)) { if (!pmd_large(*pmdp)) { - ptl_pte = NULL; - ptep = pte_alloc_map_lock(gmap->mm, pmdp, gaddr, - &ptl_pte); + ptep = gmap_pte_from_pmd(gmap, pmdp, gaddr, + &ptl_pte); if (ptep) rc = gmap_protect_pte(gmap, gaddr, ptep, prot, bits); @@ -1067,25 +1161,33 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr, gaddr += PAGE_SIZE; } } else { - rc = gmap_protect_pmd(gmap, gaddr, pmdp, prot, - bits); - if (!rc) { - dist = HPAGE_SIZE - (gaddr & ~HPAGE_MASK); - len = len < dist ? 0 : len - dist; - gaddr = (gaddr & HPAGE_MASK) + HPAGE_SIZE; + if (!page) { + /* Drop locks for allocation. 
*/ + gmap_pmd_op_end(ptl_pmd); + ptl_pmd = NULL; + page = page_table_alloc_pgste(gmap->mm); + if (!page) + return -ENOMEM; + continue; + } else { + gmap_pmd_split(gmap, gaddr, + pmdp, page); + page = NULL; } } gmap_pmd_op_end(ptl_pmd); } + if (page) + page_table_free_pgste(page); if (rc) { - if (rc == -EINVAL) + if (rc == -EINVAL || rc == -ENOMEM) return rc; /* -EAGAIN, fixup of userspace mm and gmap */ vmaddr = __gmap_translate(gmap, gaddr); if (IS_ERR_VALUE(vmaddr)) return vmaddr; - rc = gmap_pte_op_fixup(gmap, gaddr, vmaddr, prot); + rc = gmap_fixup(gmap, gaddr, vmaddr, prot); if (rc) return rc; } @@ -1168,7 +1270,7 @@ int gmap_read_table(struct gmap *gmap, unsigned long gaddr, unsigned long *val) rc = vmaddr; break; } - rc = gmap_pte_op_fixup(gmap, gaddr, vmaddr, PROT_READ); + rc = gmap_fixup(gmap, gaddr, vmaddr, PROT_READ); if (rc) break; } @@ -1251,7 +1353,7 @@ static int gmap_protect_rmap(struct gmap *sg, unsigned long raddr, radix_tree_preload_end(); if (rc) { kfree(rmap); - rc = gmap_pte_op_fixup(parent, paddr, vmaddr, PROT_READ); + rc = gmap_fixup(parent, paddr, vmaddr, PROT_READ); if (rc) return rc; continue; @@ -2165,7 +2267,7 @@ int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte) radix_tree_preload_end(); if (!rc) break; - rc = gmap_pte_op_fixup(parent, paddr, vmaddr, prot); + rc = gmap_fixup(parent, paddr, vmaddr, prot); if (rc) break; } @@ -2231,6 +2333,30 @@ static void gmap_shadow_notify(struct gmap *sg, unsigned long vmaddr, spin_unlock(&sg->guest_table_lock); } +/* + * ptep_notify_gmap - call all invalidation callbacks for a specific pte of a gmap + * @mm: pointer to the process mm_struct + * @addr: virtual address in the process address space + * @pte: pointer to the page table entry + * @bits: bits from the pgste that caused the notify call + * + * This function is assumed to be called with the guest_table_lock held. + */ +static void ptep_notify_gmap(struct gmap *gmap, unsigned long gaddr, + unsigned long vmaddr, unsigned long bits) +{ + struct gmap *sg, *next; + + if (!list_empty(&gmap->children) && (bits & PGSTE_VSIE_BIT)) { + spin_lock(&gmap->shadow_lock); + list_for_each_entry_safe(sg, next, &gmap->children, list) + gmap_shadow_notify(sg, vmaddr, gaddr); + spin_unlock(&gmap->shadow_lock); + } + if (bits & PGSTE_IN_BIT) + gmap_call_notifier(gmap, gaddr, gaddr + PAGE_SIZE - 1); +} + /** * ptep_notify - call all invalidation callbacks for a specific pte. 
* @mm: pointer to the process mm_struct @@ -2246,7 +2372,7 @@ void ptep_notify(struct mm_struct *mm, unsigned long vmaddr, { unsigned long offset, gaddr = 0; unsigned long *table; - struct gmap *gmap, *sg, *next; + struct gmap *gmap; offset = ((unsigned long) pte) & (255 * sizeof(pte_t)); offset = offset * (PAGE_SIZE / sizeof(pte_t)); @@ -2261,23 +2387,34 @@ void ptep_notify(struct mm_struct *mm, unsigned long vmaddr, if (!table) continue; - if (!list_empty(&gmap->children) && (bits & PGSTE_VSIE_BIT)) { - spin_lock(&gmap->shadow_lock); - list_for_each_entry_safe(sg, next, - &gmap->children, list) - gmap_shadow_notify(sg, vmaddr, gaddr); - spin_unlock(&gmap->shadow_lock); - } - if (bits & PGSTE_IN_BIT) - gmap_call_notifier(gmap, gaddr, gaddr + PAGE_SIZE - 1); + ptep_notify_gmap(gmap, gaddr, vmaddr, bits); } rcu_read_unlock(); } EXPORT_SYMBOL_GPL(ptep_notify); -static void pmdp_notify_gmap(struct gmap *gmap, pmd_t *pmdp, - unsigned long gaddr) +static inline void pmdp_notify_split(struct gmap *gmap, pmd_t *pmdp, + unsigned long gaddr, unsigned long vmaddr) { + int i = 0; + unsigned long bits; + pte_t *ptep = (pte_t *)(pmd_val(*pmdp) & PAGE_MASK); + + for (; i < 256; i++, gaddr += PAGE_SIZE, vmaddr += PAGE_SIZE, ptep++) { + bits = ptep_get_and_clear_notification_bits(ptep); + if (bits) + ptep_notify_gmap(gmap, gaddr, vmaddr, bits); + } +} + +static void pmdp_notify_gmap(struct gmap *gmap, pmd_t *pmdp, + unsigned long gaddr, unsigned long vmaddr) +{ + if (gmap_pmd_is_split(pmdp)) + return pmdp_notify_split(gmap, pmdp, gaddr, vmaddr); + + if (!(pmd_val(*pmdp) & _SEGMENT_ENTRY_GMAP_IN)) + return; pmd_val(*pmdp) &= ~_SEGMENT_ENTRY_GMAP_IN; gmap_call_notifier(gmap, gaddr, gaddr + HPAGE_SIZE - 1); } @@ -2296,8 +2433,9 @@ static void gmap_pmdp_xchg(struct gmap *gmap, pmd_t *pmdp, pmd_t new, unsigned long gaddr) { gaddr &= HPAGE_MASK; - pmdp_notify_gmap(gmap, pmdp, gaddr); - pmd_val(new) &= ~_SEGMENT_ENTRY_GMAP_IN; + pmdp_notify_gmap(gmap, pmdp, gaddr, 0); + if (pmd_large(new)) + pmd_val(new) &= ~GMAP_SEGMENT_NOTIFY_BITS; if (MACHINE_HAS_TLB_GUEST) __pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE, gmap->asce, IDTE_GLOBAL); @@ -2322,11 +2460,13 @@ static void gmap_pmdp_clear(struct mm_struct *mm, unsigned long vmaddr, vmaddr >> PMD_SHIFT); if (pmdp) { gaddr = __gmap_segment_gaddr((unsigned long *)pmdp); - pmdp_notify_gmap(gmap, pmdp, gaddr); - WARN_ON(pmd_val(*pmdp) & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE | - _SEGMENT_ENTRY_GMAP_UC)); + pmdp_notify_gmap(gmap, pmdp, gaddr, vmaddr); + if (pmd_large(*pmdp)) + WARN_ON(pmd_val(*pmdp) & + GMAP_SEGMENT_NOTIFY_BITS); if (purge) __pmdp_csp(pmdp); + gmap_pmd_split_free(gmap, pmdp); pmd_val(*pmdp) = _SEGMENT_ENTRY_EMPTY; } spin_unlock(&gmap->guest_table_lock); @@ -2376,14 +2516,15 @@ void gmap_pmdp_idte_local(struct mm_struct *mm, unsigned long vmaddr) if (entry) { pmdp = (pmd_t *)entry; gaddr = __gmap_segment_gaddr(entry); - pmdp_notify_gmap(gmap, pmdp, gaddr); - WARN_ON(*entry & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE | - _SEGMENT_ENTRY_GMAP_UC)); + pmdp_notify_gmap(gmap, pmdp, gaddr, vmaddr); + if (pmd_large(*pmdp)) + WARN_ON(*entry & GMAP_SEGMENT_NOTIFY_BITS); if (MACHINE_HAS_TLB_GUEST) __pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE, gmap->asce, IDTE_LOCAL); else if (MACHINE_HAS_IDTE) __pmdp_idte(gaddr, pmdp, 0, 0, IDTE_LOCAL); + gmap_pmd_split_free(gmap, pmdp); *entry = _SEGMENT_ENTRY_EMPTY; } spin_unlock(&gmap->guest_table_lock); @@ -2411,9 +2552,9 @@ void gmap_pmdp_idte_global(struct mm_struct *mm, unsigned long vmaddr) if (entry) { pmdp = (pmd_t *)entry; gaddr = 
__gmap_segment_gaddr(entry);
-		pmdp_notify_gmap(gmap, pmdp, gaddr);
-		WARN_ON(*entry & ~(_SEGMENT_ENTRY_HARDWARE_BITS_LARGE |
-				   _SEGMENT_ENTRY_GMAP_UC));
+		pmdp_notify_gmap(gmap, pmdp, gaddr, vmaddr);
+		if (pmd_large(*pmdp))
+			WARN_ON(*entry & GMAP_SEGMENT_NOTIFY_BITS);
 		if (MACHINE_HAS_TLB_GUEST)
 			__pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE, gmap->asce,
 				    IDTE_GLOBAL);
@@ -2421,6 +2562,7 @@ void gmap_pmdp_idte_global(struct mm_struct *mm, unsigned long vmaddr)
 			__pmdp_idte(gaddr, pmdp, 0, 0, IDTE_GLOBAL);
 		else
 			__pmdp_csp(pmdp);
+		gmap_pmd_split_free(gmap, pmdp);
 		*entry = _SEGMENT_ENTRY_EMPTY;
 	}
 	spin_unlock(&gmap->guest_table_lock);
@@ -2471,9 +2613,10 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long bitmap[4],
 	int i;
 	pmd_t *pmdp;
 	pte_t *ptep;
-	spinlock_t *ptl = NULL;
+	spinlock_t *ptl_pmd = NULL;
+	spinlock_t *ptl_pte = NULL;

-	pmdp = gmap_pmd_op_walk(gmap, gaddr, &ptl);
+	pmdp = gmap_pmd_op_walk(gmap, gaddr, &ptl_pmd);
 	if (!pmdp)
 		return;
@@ -2482,15 +2625,15 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long bitmap[4],
 		bitmap_fill(bitmap, _PAGE_ENTRIES);
 	} else {
 		for (i = 0; i < _PAGE_ENTRIES; i++, vmaddr += PAGE_SIZE) {
-			ptep = pte_alloc_map_lock(gmap->mm, pmdp, vmaddr, &ptl);
+			ptep = gmap_pte_from_pmd(gmap, pmdp, vmaddr, &ptl_pte);
 			if (!ptep)
 				continue;
 			if (ptep_test_and_clear_uc(gmap->mm, vmaddr, ptep))
 				set_bit(i, bitmap);
-			spin_unlock(ptl);
+			gmap_pte_op_end(ptl_pte);
 		}
 	}
-	gmap_pmd_op_end(ptl);
+	gmap_pmd_op_end(ptl_pmd);
 }
 EXPORT_SYMBOL_GPL(gmap_sync_dirty_log_pmd);

diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 4b184744350b..55855192c41f 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -719,6 +719,39 @@ void ptep_zap_key(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 	preempt_enable();
 }

+unsigned long ptep_get_and_clear_notification_bits(pte_t *ptep)
+{
+	pgste_t pgste;
+	unsigned long bits;
+
+	pgste = pgste_get_lock(ptep);
+	bits = pgste_val(pgste) & (PGSTE_IN_BIT | PGSTE_VSIE_BIT);
+	pgste_val(pgste) ^= bits;
+	pgste_set_unlock(ptep, pgste);
+
+	return bits;
+}
+EXPORT_SYMBOL_GPL(ptep_get_and_clear_notification_bits);
+
+void ptep_remove_protection_split(struct mm_struct *mm, pte_t *ptep,
+				  unsigned long gaddr)
+{
+	pte_t pte;
+	pgste_t pgste;
+
+	pgste = pgste_get_lock(ptep);
+	pgste_val(pgste) |= PGSTE_UC_BIT;
+	pte = *ptep;
+	pte_val(pte) &= ~_PAGE_PROTECT;
+
+	pgste = pgste_pte_notify(mm, gaddr, ptep, pgste);
+	ptep_ipte_global(mm, gaddr, ptep, 0);
+
+	*ptep = pte;
+	pgste_set_unlock(ptep, pgste);
+}
+EXPORT_SYMBOL_GPL(ptep_remove_protection_split);
+
 /*
  * Test and reset if a guest page is dirty
  */

From patchwork Wed Sep 19 08:47:53 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605505
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com,
    schwidefsky@de.ibm.com
Subject: [RFC 05/14] s390/mm: Split huge pages when migrating
Date: Wed, 19 Sep 2018 10:47:53 +0200
X-Mailer: git-send-email 2.14.3
Message-Id: <20180919084802.183381-6-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

Right now we mark the huge page that is being written to as dirty
although only a single byte may have changed.
This means we have to migrate 1MB although only a very limited amount
of memory in that range might actually be dirty. To speed up migration,
this patch splits up write-protected huge pages into normal pages. The
protection is then only removed from the normal page that caused the
fault.
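On the fault path this boils down to the following (condensed from the
__gmap_link() hunk below): only the pte of the 4k page that faulted is
unprotected, so only that page gets marked dirty.

	/* Unprotect just the faulting 4k page of the split segment. */
	ptep = pte_offset_map((pmd_t *)table, vmaddr);
	if (pte_val(*ptep) & _PAGE_PROTECT)
		ptep_remove_protection_split(mm, ptep, vmaddr);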
Signed-off-by: Janosch Frank
---
 arch/s390/mm/gmap.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 8e78a124d31a..7bc490a6fbeb 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -532,6 +532,9 @@ void gmap_unlink(struct mm_struct *mm, unsigned long *table,
 static void gmap_pmdp_xchg(struct gmap *gmap, pmd_t *old, pmd_t new,
 			   unsigned long gaddr);

+static void gmap_pmd_split(struct gmap *gmap, unsigned long gaddr,
+			   pmd_t *pmdp, struct page *page);
+
 /**
  * gmap_link - set up shadow page tables to connect a host to a guest address
  * @gmap: pointer to guest mapping meta data structure
@@ -547,12 +550,12 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr)
 {
 	struct mm_struct *mm;
 	unsigned long *table;
+	struct page *page = NULL;
 	spinlock_t *ptl;
 	pgd_t *pgd;
 	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
-	u64 unprot;
 	pte_t *ptep;
 	int rc;

@@ -600,6 +603,7 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr)
 	/* Are we allowed to use huge pages? */
 	if (pmd_large(*pmd) && !gmap->mm->context.allow_gmap_hpage_1m)
 		return -EFAULT;
+retry_split:
 	/* Link gmap segment table entry location to page table. */
 	rc = radix_tree_preload(GFP_KERNEL);
 	if (rc)
@@ -627,10 +631,25 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr)
 		spin_unlock(&gmap->guest_table_lock);
 	} else if (*table & _SEGMENT_ENTRY_PROTECT &&
 		   !(pmd_val(*pmd) & _SEGMENT_ENTRY_PROTECT)) {
-		unprot = (u64)*table;
-		unprot &= ~_SEGMENT_ENTRY_PROTECT;
-		unprot |= _SEGMENT_ENTRY_GMAP_UC;
-		gmap_pmdp_xchg(gmap, (pmd_t *)table, __pmd(unprot), gaddr);
+		if (page) {
+			gmap_pmd_split(gmap, gaddr, (pmd_t *)table, page);
+			page = NULL;
+		} else {
+			spin_unlock(ptl);
+			ptl = NULL;
+			radix_tree_preload_end();
+			page = page_table_alloc_pgste(mm);
+			if (!page)
+				rc = -ENOMEM;
+			else
+				goto retry_split;
+		}
+		/*
+		 * The split moves over the protection, so we still
+		 * need to unprotect.
+		 */
+		ptep = pte_offset_map((pmd_t *)table, vmaddr);
+		ptep_remove_protection_split(mm, ptep, vmaddr);
 	} else if (gmap_pmd_is_split((pmd_t *)table)) {
 		/*
 		 * Split pmds are somewhere in-between a normal and a
@@ -642,7 +661,10 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr)
 		if (pte_val(*ptep) & _PAGE_PROTECT)
 			ptep_remove_protection_split(mm, ptep, vmaddr);
 	}
-	spin_unlock(ptl);
+	if (page)
+		page_table_free_pgste(page);
+	if (ptl)
+		spin_unlock(ptl);
 	radix_tree_preload_end();
 	return rc;
 }

From patchwork Wed Sep 19 08:47:54 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605503
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com,
    schwidefsky@de.ibm.com
Subject: [RFC 06/14] s390/mm: Provide vmaddr to pmd notification
Date: Wed, 19 Sep 2018 10:47:54 +0200
X-Mailer: git-send-email 2.14.3
Message-Id: <20180919084802.183381-7-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

The host virtual address (vmaddr) of a notified segment will be needed
for shadow tables in later patches, so pass it through to the pmd
notification functions.
Signed-off-by: Janosch Frank --- arch/s390/mm/gmap.c | 51 ++++++++++++++++++++++++++------------------------- 1 file changed, 26 insertions(+), 25 deletions(-) diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index 7bc490a6fbeb..70763bcd0e0b 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -530,10 +530,10 @@ void gmap_unlink(struct mm_struct *mm, unsigned long *table, } static void gmap_pmdp_xchg(struct gmap *gmap, pmd_t *old, pmd_t new, - unsigned long gaddr); + unsigned long gaddr, unsigned long vmaddr); static void gmap_pmd_split(struct gmap *gmap, unsigned long gaddr, - pmd_t *pmdp, struct page *page); + unsigned long vmaddr, pmd_t *pmdp, struct page *page); /** * gmap_link - set up shadow page tables to connect a host to a guest address @@ -632,7 +632,8 @@ int __gmap_link(struct gmap *gmap, unsigned long gaddr, unsigned long vmaddr) } else if (*table & _SEGMENT_ENTRY_PROTECT && !(pmd_val(*pmd) & _SEGMENT_ENTRY_PROTECT)) { if (page) { - gmap_pmd_split(gmap, gaddr, (pmd_t *)table, page); + gmap_pmd_split(gmap, gaddr, vmaddr, + (pmd_t *)table, page); page = NULL; } else { spin_unlock(ptl); @@ -948,19 +949,15 @@ static void gmap_pte_op_end(spinlock_t *ptl) * Returns a pointer to the pmd for a guest address, or NULL */ static inline pmd_t *gmap_pmd_op_walk(struct gmap *gmap, unsigned long gaddr, - spinlock_t **ptl) + unsigned long vmaddr, spinlock_t **ptl) { pmd_t *pmdp, *hpmdp; - unsigned long vmaddr; BUG_ON(gmap_is_shadow(gmap)); *ptl = NULL; if (gmap->mm->context.allow_gmap_hpage_1m) { - vmaddr = __gmap_translate(gmap, gaddr); - if (IS_ERR_VALUE(vmaddr)) - return NULL; hpmdp = pmd_alloc_map(gmap->mm, vmaddr); if (!hpmdp) return NULL; @@ -1043,7 +1040,7 @@ static inline void gmap_pmd_split_free(struct gmap *gmap, pmd_t *pmdp) * aren't tracked anywhere else. */ static void gmap_pmd_split(struct gmap *gmap, unsigned long gaddr, - pmd_t *pmdp, struct page *page) + unsigned long vmaddr, pmd_t *pmdp, struct page *page) { unsigned long *ptable = (unsigned long *) page_to_phys(page); pmd_t new; @@ -1065,7 +1062,7 @@ static void gmap_pmd_split(struct gmap *gmap, unsigned long gaddr, spin_lock(&gmap->split_list_lock); list_add(&page->lru, &gmap->split_list); spin_unlock(&gmap->split_list_lock); - gmap_pmdp_xchg(gmap, pmdp, new, gaddr); + gmap_pmdp_xchg(gmap, pmdp, new, gaddr, vmaddr); } /* @@ -1083,7 +1080,8 @@ static void gmap_pmd_split(struct gmap *gmap, unsigned long gaddr, * guest_table_lock held. 
*/ static int gmap_protect_pmd(struct gmap *gmap, unsigned long gaddr, - pmd_t *pmdp, int prot, unsigned long bits) + unsigned long vmaddr, pmd_t *pmdp, int prot, + unsigned long bits) { int pmd_i = pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID; int pmd_p = pmd_val(*pmdp) & _SEGMENT_ENTRY_PROTECT; @@ -1095,13 +1093,13 @@ static int gmap_protect_pmd(struct gmap *gmap, unsigned long gaddr, if (prot == PROT_NONE && !pmd_i) { pmd_val(new) |= _SEGMENT_ENTRY_INVALID; - gmap_pmdp_xchg(gmap, pmdp, new, gaddr); + gmap_pmdp_xchg(gmap, pmdp, new, gaddr, vmaddr); } if (prot == PROT_READ && !pmd_p) { pmd_val(new) &= ~_SEGMENT_ENTRY_INVALID; pmd_val(new) |= _SEGMENT_ENTRY_PROTECT; - gmap_pmdp_xchg(gmap, pmdp, new, gaddr); + gmap_pmdp_xchg(gmap, pmdp, new, gaddr, vmaddr); } if (bits & GMAP_NOTIFY_MPROT) @@ -1164,10 +1162,14 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr, int rc; BUG_ON(gmap_is_shadow(gmap)); + while (len) { rc = -EAGAIN; - - pmdp = gmap_pmd_op_walk(gmap, gaddr, &ptl_pmd); + vmaddr = __gmap_translate(gmap, gaddr); + if (IS_ERR_VALUE(vmaddr)) + return vmaddr; + vmaddr |= gaddr & ~PMD_MASK; + pmdp = gmap_pmd_op_walk(gmap, gaddr, vmaddr, &ptl_pmd); if (pmdp && !(pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)) { if (!pmd_large(*pmdp)) { ptep = gmap_pte_from_pmd(gmap, pmdp, gaddr, @@ -1192,7 +1194,7 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr, return -ENOMEM; continue; } else { - gmap_pmd_split(gmap, gaddr, + gmap_pmd_split(gmap, gaddr, vmaddr, pmdp, page); page = NULL; } @@ -1206,9 +1208,6 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr, return rc; /* -EAGAIN, fixup of userspace mm and gmap */ - vmaddr = __gmap_translate(gmap, gaddr); - if (IS_ERR_VALUE(vmaddr)) - return vmaddr; rc = gmap_fixup(gmap, gaddr, vmaddr, prot); if (rc) return rc; @@ -2432,6 +2431,7 @@ static inline void pmdp_notify_split(struct gmap *gmap, pmd_t *pmdp, static void pmdp_notify_gmap(struct gmap *gmap, pmd_t *pmdp, unsigned long gaddr, unsigned long vmaddr) { + BUG_ON((gaddr & ~HPAGE_MASK) || (vmaddr & ~HPAGE_MASK)); if (gmap_pmd_is_split(pmdp)) return pmdp_notify_split(gmap, pmdp, gaddr, vmaddr); @@ -2452,10 +2452,11 @@ static void pmdp_notify_gmap(struct gmap *gmap, pmd_t *pmdp, * held. */ static void gmap_pmdp_xchg(struct gmap *gmap, pmd_t *pmdp, pmd_t new, - unsigned long gaddr) + unsigned long gaddr, unsigned long vmaddr) { gaddr &= HPAGE_MASK; - pmdp_notify_gmap(gmap, pmdp, gaddr, 0); + vmaddr &= HPAGE_MASK; + pmdp_notify_gmap(gmap, pmdp, gaddr, vmaddr); if (pmd_large(new)) pmd_val(new) &= ~GMAP_SEGMENT_NOTIFY_BITS; if (MACHINE_HAS_TLB_GUEST) @@ -2603,7 +2604,7 @@ EXPORT_SYMBOL_GPL(gmap_pmdp_idte_global); * held. 
 */
 bool gmap_test_and_clear_dirty_pmd(struct gmap *gmap, pmd_t *pmdp,
-				   unsigned long gaddr)
+				   unsigned long gaddr, unsigned long vmaddr)
 {
 	if (pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)
 		return false;
@@ -2615,7 +2616,7 @@ bool gmap_test_and_clear_dirty_pmd(struct gmap *gmap, pmd_t *pmdp,
 	/* Clear UC indication and reset protection */
 	pmd_val(*pmdp) &= ~_SEGMENT_ENTRY_GMAP_UC;
-	gmap_protect_pmd(gmap, gaddr, pmdp, PROT_READ, 0);
+	gmap_protect_pmd(gmap, gaddr, vmaddr, pmdp, PROT_READ, 0);
 	return true;
 }
@@ -2638,12 +2639,12 @@ void gmap_sync_dirty_log_pmd(struct gmap *gmap, unsigned long bitmap[4],
 	spinlock_t *ptl_pmd = NULL;
 	spinlock_t *ptl_pte = NULL;
-	pmdp = gmap_pmd_op_walk(gmap, gaddr, &ptl_pmd);
+	pmdp = gmap_pmd_op_walk(gmap, gaddr, vmaddr, &ptl_pmd);
 	if (!pmdp)
 		return;
 	if (pmd_large(*pmdp)) {
-		if (gmap_test_and_clear_dirty_pmd(gmap, pmdp, gaddr))
+		if (gmap_test_and_clear_dirty_pmd(gmap, pmdp, gaddr, vmaddr))
 			bitmap_fill(bitmap, _PAGE_ENTRIES);
 	} else {
 		for (i = 0; i < _PAGE_ENTRIES; i++, vmaddr += PAGE_SIZE) {

From patchwork Wed Sep 19 08:47:55 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605507
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com, schwidefsky@de.ibm.com
Subject: [RFC 07/14] s390/mm: Add gmap_idte_global
Date: Wed, 19 Sep 2018 10:47:55 +0200
Message-Id: <20180919084802.183381-8-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

Introduce a function to do an idte global flush on a gmap pmd and
remove some code duplication.
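The helper centralizes the choice of invalidation primitive; from a caller's point of view (sketch only, the cascade itself is in the diff below):

	/*
	 * One call replaces the open-coded cascade: guest-ASCE idte when
	 * the TLB-guest facility is available, plain idte when the machine
	 * has IDTE, and csp as the final fallback.
	 */
	gmap_idte_global(gmap->asce, pmdp, gaddr);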
Signed-off-by: Janosch Frank
Reviewed-by: David Hildenbrand
---
 arch/s390/mm/gmap.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 70763bcd0e0b..26cc6ce19afb 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -1005,6 +1005,18 @@ static pte_t *gmap_pte_from_pmd(struct gmap *gmap, pmd_t *pmdp,
 	return pte_offset_map(pmdp, addr);
 }

+static inline void gmap_idte_global(unsigned long asce, pmd_t *pmdp,
+				    unsigned long gaddr)
+{
+	if (MACHINE_HAS_TLB_GUEST)
+		__pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE, asce,
+			    IDTE_GLOBAL);
+	else if (MACHINE_HAS_IDTE)
+		__pmdp_idte(gaddr, pmdp, 0, 0, IDTE_GLOBAL);
+	else
+		__pmdp_csp(pmdp);
+}
+
 /**
  * gmap_pmd_split_free - Free a split pmd's page table
  * @pmdp The split pmd that we free of its page table
@@ -2459,13 +2471,7 @@ static void gmap_pmdp_xchg(struct gmap *gmap, pmd_t *pmdp, pmd_t new,
 	pmdp_notify_gmap(gmap, pmdp, gaddr, vmaddr);
 	if (pmd_large(new))
 		pmd_val(new) &= ~GMAP_SEGMENT_NOTIFY_BITS;
-	if (MACHINE_HAS_TLB_GUEST)
-		__pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE, gmap->asce,
-			    IDTE_GLOBAL);
-	else if (MACHINE_HAS_IDTE)
-		__pmdp_idte(gaddr, pmdp, 0, 0, IDTE_GLOBAL);
-	else
-		__pmdp_csp(pmdp);
+	gmap_idte_global(gmap->asce, pmdp, gaddr);
 	*pmdp = new;
 }

@@ -2578,13 +2584,7 @@ void gmap_pmdp_idte_global(struct mm_struct *mm, unsigned long vmaddr)
 		pmdp_notify_gmap(gmap, pmdp, gaddr, vmaddr);
 		if (pmd_large(*pmdp))
 			WARN_ON(*entry & GMAP_SEGMENT_NOTIFY_BITS);
-		if (MACHINE_HAS_TLB_GUEST)
-			__pmdp_idte(gaddr, pmdp, IDTE_GUEST_ASCE,
-				    gmap->asce, IDTE_GLOBAL);
-		else if (MACHINE_HAS_IDTE)
-			__pmdp_idte(gaddr, pmdp, 0, 0, IDTE_GLOBAL);
-		else
-			__pmdp_csp(pmdp);
+		gmap_idte_global(gmap->asce, pmdp, gaddr);
 		gmap_pmd_split_free(gmap, pmdp);
 		*entry = _SEGMENT_ENTRY_EMPTY;
 	}

From patchwork Wed Sep 19 08:47:56 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605509
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com, schwidefsky@de.ibm.com
Subject: [RFC 08/14] s390/mm: Make gmap_read_table EDAT1 compatible
Date: Wed, 19 Sep 2018 10:47:56 +0200
Message-Id: <20180919084802.183381-9-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

For the upcoming support of VSIE guests on huge page backed hosts, we
need to be able to read from large segments.
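A condensed sketch of the new large-segment path in gmap_read_table() (taken from the hunk below): when the walk ends at a 1 MB segment there is no pte level, so the value is read straight through the segment frame:

	address = pmd_val(*pmdp) & HPAGE_MASK;
	address += gaddr & ~HPAGE_MASK;	/* byte offset within the segment */
	*val = *(unsigned long *) address;

For a normal pmd the function still drops down to the pte as before, now via gmap_pte_from_pmd().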
Signed-off-by: Janosch Frank
---
 arch/s390/mm/gmap.c | 43 ++++++++++++++++++++++++++-----------------
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 26cc6ce19afb..ba0425f1c2c0 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -1274,35 +1274,44 @@ EXPORT_SYMBOL_GPL(gmap_mprotect_notify);
 int gmap_read_table(struct gmap *gmap, unsigned long gaddr, unsigned long *val)
 {
 	unsigned long address, vmaddr;
-	spinlock_t *ptl;
+	spinlock_t *ptl_pmd = NULL, *ptl_pte = NULL;
+	pmd_t *pmdp;
 	pte_t *ptep, pte;
 	int rc;

-	if (gmap_is_shadow(gmap))
-		return -EINVAL;
+	BUG_ON(gmap_is_shadow(gmap));

 	while (1) {
 		rc = -EAGAIN;
-		ptep = gmap_pte_op_walk(gmap, gaddr, &ptl);
-		if (ptep) {
-			pte = *ptep;
-			if (pte_present(pte) && (pte_val(pte) & _PAGE_READ)) {
-				address = pte_val(pte) & PAGE_MASK;
-				address += gaddr & ~PAGE_MASK;
+		vmaddr = __gmap_translate(gmap, gaddr);
+		if (IS_ERR_VALUE(vmaddr))
+			return vmaddr;
+		pmdp = gmap_pmd_op_walk(gmap, gaddr, vmaddr, &ptl_pmd);
+		if (pmdp && !(pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)) {
+			if (!pmd_large(*pmdp)) {
+				ptep = gmap_pte_from_pmd(gmap, pmdp, vmaddr, &ptl_pte);
+				if (ptep) {
+					pte = *ptep;
+					if (pte_present(pte) && (pte_val(pte) & _PAGE_READ)) {
+						address = pte_val(pte) & PAGE_MASK;
+						address += gaddr & ~PAGE_MASK;
+						*val = *(unsigned long *) address;
+						pte_val(*ptep) |= _PAGE_YOUNG;
+						/* Do *NOT* clear the _PAGE_INVALID bit! */
+						rc = 0;
+					}
+				}
+				gmap_pte_op_end(ptl_pte);
+			} else {
+				address = pmd_val(*pmdp) & HPAGE_MASK;
+				address += gaddr & ~HPAGE_MASK;
 				*val = *(unsigned long *) address;
-				pte_val(*ptep) |= _PAGE_YOUNG;
-				/* Do *NOT* clear the _PAGE_INVALID bit! */
 				rc = 0;
 			}
-			gmap_pte_op_end(ptl);
+			gmap_pmd_op_end(ptl_pmd);
 		}
 		if (!rc)
 			break;
-		vmaddr = __gmap_translate(gmap, gaddr);
-		if (IS_ERR_VALUE(vmaddr)) {
-			rc = vmaddr;
-			break;
-		}
 		rc = gmap_fixup(gmap, gaddr, vmaddr, PROT_READ);
 		if (rc)
 			break;

From patchwork Wed Sep 19 08:47:57 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605511
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com, schwidefsky@de.ibm.com
Subject: [RFC 09/14] s390/mm: Make gmap_protect_rmap EDAT1 compatible
Date: Wed, 19 Sep 2018 10:47:57 +0200
Message-Id: <20180919084802.183381-10-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

For the upcoming large page shadowing support, let's add the
possibility to split a huge page and protect it with
gmap_protect_rmap() for shadowing purposes.
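In essence gmap_protect_rmap() gains the same split-on-demand loop that gmap_protect_range() already uses; a condensed sketch of the retry pattern for a large pmd (rearranged from the hunk below):

	if (!page) {
		/* Drop locks and the preload for the allocation. */
		gmap_pmd_op_end(ptl_pmd);
		ptl_pmd = NULL;
		radix_tree_preload_end();
		kfree(rmap);
		page = page_table_alloc_pgste(parent->mm);
		if (!page)
			return -ENOMEM;
		continue;	/* re-walk with the preallocated table */
	}
	gmap_pmd_split(parent, paddr, vmaddr, pmdp, page);
	page = NULL;

Once split, the huge segment is backed by a regular page table and can be protected pte by pte, as in the 4k case.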
Signed-off-by: Janosch Frank --- arch/s390/mm/gmap.c | 87 +++++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 67 insertions(+), 20 deletions(-) diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index ba0425f1c2c0..c64f9a48f5f8 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -1138,7 +1138,8 @@ static int gmap_protect_pmd(struct gmap *gmap, unsigned long gaddr, * Expected to be called with sg->mm->mmap_sem in read */ static int gmap_protect_pte(struct gmap *gmap, unsigned long gaddr, - pte_t *ptep, int prot, unsigned long bits) + unsigned long vmaddr, pte_t *ptep, + int prot, unsigned long bits) { int rc; unsigned long pbits = 0; @@ -1187,7 +1188,7 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr, ptep = gmap_pte_from_pmd(gmap, pmdp, gaddr, &ptl_pte); if (ptep) - rc = gmap_protect_pte(gmap, gaddr, + rc = gmap_protect_pte(gmap, gaddr, vmaddr, ptep, prot, bits); else rc = -ENOMEM; @@ -1346,6 +1347,21 @@ static inline void gmap_insert_rmap(struct gmap *sg, unsigned long vmaddr, } } +static int gmap_protect_rmap_pte(struct gmap *sg, struct gmap_rmap *rmap, + unsigned long paddr, unsigned long vmaddr, + pte_t *ptep, int prot) +{ + int rc = 0; + + spin_lock(&sg->guest_table_lock); + rc = gmap_protect_pte(sg->parent, paddr, vmaddr, ptep, + prot, GMAP_NOTIFY_SHADOW); + if (!rc) + gmap_insert_rmap(sg, vmaddr, rmap); + spin_unlock(&sg->guest_table_lock); + return rc; +} + /** * gmap_protect_rmap - restrict access rights to memory (RO) and create an rmap * @sg: pointer to the shadow guest address space structure @@ -1362,16 +1378,15 @@ static int gmap_protect_rmap(struct gmap *sg, unsigned long raddr, struct gmap *parent; struct gmap_rmap *rmap; unsigned long vmaddr; - spinlock_t *ptl; + pmd_t *pmdp; pte_t *ptep; + spinlock_t *ptl_pmd = NULL, *ptl_pte = NULL; + struct page *page = NULL; int rc; BUG_ON(!gmap_is_shadow(sg)); parent = sg->parent; while (len) { - vmaddr = __gmap_translate(parent, paddr); - if (IS_ERR_VALUE(vmaddr)) - return vmaddr; rmap = kzalloc(sizeof(*rmap), GFP_KERNEL); if (!rmap) return -ENOMEM; @@ -1382,26 +1397,58 @@ static int gmap_protect_rmap(struct gmap *sg, unsigned long raddr, return rc; } rc = -EAGAIN; - ptep = gmap_pte_op_walk(parent, paddr, &ptl); - if (ptep) { - spin_lock(&sg->guest_table_lock); - rc = ptep_force_prot(parent->mm, paddr, ptep, PROT_READ, - PGSTE_VSIE_BIT); - if (!rc) - gmap_insert_rmap(sg, vmaddr, rmap); - spin_unlock(&sg->guest_table_lock); - gmap_pte_op_end(ptl); + vmaddr = __gmap_translate(parent, paddr); + if (IS_ERR_VALUE(vmaddr)) + return vmaddr; + vmaddr |= paddr & ~PMD_MASK; + pmdp = gmap_pmd_op_walk(parent, paddr, vmaddr, &ptl_pmd); + if (pmdp && !(pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)) { + if (!pmd_large(*pmdp)) { + ptl_pte = NULL; + ptep = gmap_pte_from_pmd(parent, pmdp, paddr, + &ptl_pte); + if (ptep) + rc = gmap_protect_rmap_pte(sg, rmap, paddr, + vmaddr, ptep, + PROT_READ); + else + rc = -ENOMEM; + gmap_pte_op_end(ptl_pte); + if (!rc) { + paddr += PAGE_SIZE; + len -= PAGE_SIZE; + } + } else { + if (!page) { + /* Drop locks for allocation. 
 */
+				gmap_pmd_op_end(ptl_pmd);
+				ptl_pmd = NULL;
+				radix_tree_preload_end();
+				kfree(rmap);
+				page = page_table_alloc_pgste(parent->mm);
+				if (!page)
+					return -ENOMEM;
+				continue;
+			} else {
+				gmap_pmd_split(parent, paddr, vmaddr,
+					       pmdp, page);
+				page = NULL;
+			}
+
+		}
+		gmap_pmd_op_end(ptl_pmd);
 	}
-	radix_tree_preload_end();
-	if (rc) {
+	if (page)
+		page_table_free_pgste(page);
+	else
+		radix_tree_preload_end();
+	if (rc)
 		kfree(rmap);
+	if (rc == -EAGAIN) {
 		rc = gmap_fixup(parent, paddr, vmaddr, PROT_READ);
 		if (rc)
 			return rc;
-		continue;
 	}
-	paddr += PAGE_SIZE;
-	len -= PAGE_SIZE;
 }
 return 0;
 }

From patchwork Wed Sep 19 08:47:58 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605513
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com, schwidefsky@de.ibm.com
Subject: [RFC 10/14] s390/mm: Add simple ptep shadow function
Date: Wed, 19 Sep 2018 10:47:58 +0200
Message-Id: <20180919084802.183381-11-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

Let's factor out setting the shadow pte, so we can reuse that
function for later huge to 4k shadows where we don't have a spte or
spgste.
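After the factoring, ptep_shadow_pte() reduces to its validity checks plus one call; a condensed sketch of how the new helper slots in (taken from the diff below):

	pgste_val(spgste) |= PGSTE_VSIE_BIT;
	ptep_shadow_set(spte, tptep, pte);	/* write shadow pte + pgste */
	rc = 1;

Later, the huge-to-4k shadow path can call ptep_shadow_set() with a pte it constructs from the parent's segment entry, where no source pte/pgste pair exists.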
Signed-off-by: Janosch Frank
---
 arch/s390/include/asm/pgtable.h |  1 +
 arch/s390/mm/pgtable.c          | 24 ++++++++++++++--------
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 54d8376b7a10..547810d43fa7 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1101,6 +1101,7 @@ void ptep_zap_unused(struct mm_struct *mm, unsigned long addr,
 void ptep_zap_key(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
 int ptep_shadow_pte(struct mm_struct *mm, unsigned long saddr,
 		    pte_t *sptep, pte_t *tptep, pte_t pte);
+void ptep_shadow_set(pte_t spte, pte_t *tptep, pte_t pte);
 void ptep_unshadow_pte(struct mm_struct *mm, unsigned long saddr, pte_t *ptep);

 unsigned long ptep_get_and_clear_notification_bits(pte_t *ptep);
diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 55855192c41f..1c1c45174394 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -625,11 +625,24 @@ int ptep_force_prot(struct mm_struct *mm, unsigned long addr,
 	return 0;
 }

+void ptep_shadow_set(pte_t spte, pte_t *tptep, pte_t pte)
+{
+	pte_t tpte;
+	pgste_t tpgste;
+
+	tpgste = pgste_get_lock(tptep);
+	pte_val(tpte) = (pte_val(spte) & PAGE_MASK) |
+			(pte_val(pte) & _PAGE_PROTECT);
+	/* don't touch the storage key - it belongs to parent pgste */
+	tpgste = pgste_set_pte(tptep, tpgste, tpte);
+	pgste_set_unlock(tptep, tpgste);
+}
+
 int ptep_shadow_pte(struct mm_struct *mm, unsigned long saddr,
 		    pte_t *sptep, pte_t *tptep, pte_t pte)
 {
-	pgste_t spgste, tpgste;
-	pte_t spte, tpte;
+	pgste_t spgste;
+	pte_t spte;
 	int rc = -EAGAIN;

 	if (!(pte_val(*tptep) & _PAGE_INVALID))
@@ -640,12 +653,7 @@ int ptep_shadow_pte(struct mm_struct *mm, unsigned long saddr,
 	    !((pte_val(spte) & _PAGE_PROTECT) &&
 	      !(pte_val(pte) & _PAGE_PROTECT))) {
 		pgste_val(spgste) |= PGSTE_VSIE_BIT;
-		tpgste = pgste_get_lock(tptep);
-		pte_val(tpte) = (pte_val(spte) & PAGE_MASK) |
-				(pte_val(pte) & _PAGE_PROTECT);
-		/* don't touch the storage key - it belongs to parent pgste */
-		tpgste = pgste_set_pte(tptep, tpgste, tpte);
-		pgste_set_unlock(tptep, tpgste);
+		ptep_shadow_set(spte, tptep, pte);
 		rc = 1;
 	}
 	pgste_set_unlock(sptep, spgste);

From patchwork Wed Sep 19 08:47:59 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605519
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com, schwidefsky@de.ibm.com
Subject: [RFC 11/14] s390/mm: Add gmap shadowing for large pmds
Date: Wed, 19 Sep 2018 10:47:59 +0200
Message-Id: <20180919084802.183381-12-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

Up to now we could only shadow large pmds when the parent's mapping
was done with normal sized pmds. This is done by introducing fake
page tables and effectively running the level 3 guest with a standard
memory backing instead of the large one. With this patch we add
shadowing when the host is large page backed. This allows us to run
normal and large backed VMs inside a large backed host.
Signed-off-by: Janosch Frank --- arch/s390/include/asm/gmap.h | 9 +- arch/s390/kvm/gaccess.c | 52 +++++-- arch/s390/mm/gmap.c | 327 +++++++++++++++++++++++++++++++++++-------- 3 files changed, 316 insertions(+), 72 deletions(-) diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h index c667bd0181d4..3df7a004e6e5 100644 --- a/arch/s390/include/asm/gmap.h +++ b/arch/s390/include/asm/gmap.h @@ -16,11 +16,12 @@ /* Status bits only for huge segment entries */ #define _SEGMENT_ENTRY_GMAP_IN 0x8000 /* invalidation notify bit */ #define _SEGMENT_ENTRY_GMAP_UC 0x4000 /* dirty (migration) */ +#define _SEGMENT_ENTRY_GMAP_VSIE 0x2000 /* vsie bit */ /* Status bits in the gmap segment entry. */ #define _SEGMENT_ENTRY_GMAP_SPLIT 0x0001 /* split huge pmd */ #define GMAP_SEGMENT_STATUS_BITS (_SEGMENT_ENTRY_GMAP_UC | _SEGMENT_ENTRY_GMAP_SPLIT) -#define GMAP_SEGMENT_NOTIFY_BITS _SEGMENT_ENTRY_GMAP_IN +#define GMAP_SEGMENT_NOTIFY_BITS (_SEGMENT_ENTRY_GMAP_IN | _SEGMENT_ENTRY_GMAP_VSIE) /** * struct gmap_struct - guest address space @@ -146,9 +147,11 @@ int gmap_shadow_sgt(struct gmap *sg, unsigned long saddr, unsigned long sgt, int fake); int gmap_shadow_pgt(struct gmap *sg, unsigned long saddr, unsigned long pgt, int fake); -int gmap_shadow_pgt_lookup(struct gmap *sg, unsigned long saddr, - unsigned long *pgt, int *dat_protection, int *fake); +int gmap_shadow_sgt_lookup(struct gmap *sg, unsigned long saddr, + unsigned long *pgt, int *dat_protection, + int *fake, int *lvl); int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte); +int gmap_shadow_segment(struct gmap *sg, unsigned long saddr, pmd_t pmd); void gmap_register_pte_notifier(struct gmap_notifier *); void gmap_unregister_pte_notifier(struct gmap_notifier *); diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c index 07d30ffcfa41..0b4cde3e431e 100644 --- a/arch/s390/kvm/gaccess.c +++ b/arch/s390/kvm/gaccess.c @@ -981,7 +981,7 @@ int kvm_s390_check_low_addr_prot_real(struct kvm_vcpu *vcpu, unsigned long gra) */ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr, unsigned long *pgt, int *dat_protection, - int *fake) + int *fake, int *lvl) { struct gmap *parent; union asce asce; @@ -1130,14 +1130,25 @@ static int kvm_s390_shadow_tables(struct gmap *sg, unsigned long saddr, if (ste.cs && asce.p) return PGM_TRANSLATION_SPEC; *dat_protection |= ste.fc0.p; + + /* Guest is huge page mapped */ if (ste.fc && sg->edat_level >= 1) { - *fake = 1; - ptr = ste.fc1.sfaa * _SEGMENT_SIZE; - ste.val = ptr; - goto shadow_pgt; + /* 4k to 1m, we absolutely need fake shadow tables. 
*/ + if (!parent->mm->context.allow_gmap_hpage_1m) { + *fake = 1; + ptr = ste.fc1.sfaa * _SEGMENT_SIZE; + ste.val = ptr; + goto shadow_pgt; + } else { + *lvl = 1; + *pgt = ptr; + return 0; + + } } ptr = ste.fc0.pto * (PAGE_SIZE / 2); shadow_pgt: + *lvl = 0; ste.fc0.p |= *dat_protection; rc = gmap_shadow_pgt(sg, saddr, ste.val, *fake); if (rc) @@ -1166,8 +1177,9 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg, { union vaddress vaddr; union page_table_entry pte; + union segment_table_entry ste; unsigned long pgt; - int dat_protection, fake; + int dat_protection, fake, lvl = 0; int rc; down_read(&sg->mm->mmap_sem); @@ -1178,12 +1190,35 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg, */ ipte_lock(vcpu); - rc = gmap_shadow_pgt_lookup(sg, saddr, &pgt, &dat_protection, &fake); + rc = gmap_shadow_sgt_lookup(sg, saddr, &pgt, &dat_protection, &fake, &lvl); if (rc) rc = kvm_s390_shadow_tables(sg, saddr, &pgt, &dat_protection, - &fake); + &fake, &lvl); vaddr.addr = saddr; + + /* Shadow stopped at segment level, we map pmd to pmd */ + if (!rc && lvl) { + rc = gmap_read_table(sg->parent, pgt + vaddr.sx * 8, &ste.val); + if (!rc && ste.i) + rc = PGM_PAGE_TRANSLATION; + ste.fc1.p |= dat_protection; + if (!rc) + rc = gmap_shadow_segment(sg, saddr, __pmd(ste.val)); + if (rc == -EISDIR) { + /* Hit a split pmd, we need to setup a fake page table */ + fake = 1; + pgt = ste.fc1.sfaa * _SEGMENT_SIZE; + ste.val = pgt; + rc = gmap_shadow_pgt(sg, saddr, ste.val, fake); + if (rc) + goto out; + } else { + /* We're done */ + goto out; + } + } + if (fake) { pte.val = pgt + vaddr.px * PAGE_SIZE; goto shadow_page; @@ -1198,6 +1233,7 @@ int kvm_s390_shadow_fault(struct kvm_vcpu *vcpu, struct gmap *sg, pte.p |= dat_protection; if (!rc) rc = gmap_shadow_page(sg, saddr, __pte(pte.val)); +out: ipte_unlock(vcpu); up_read(&sg->mm->mmap_sem); return rc; diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index c64f9a48f5f8..f697c73afba3 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -879,28 +879,6 @@ static inline unsigned long *gmap_table_walk(struct gmap *gmap, return table; } -/** - * gmap_pte_op_walk - walk the gmap page table, get the page table lock - * and return the pte pointer - * @gmap: pointer to guest mapping meta data structure - * @gaddr: virtual address in the guest address space - * @ptl: pointer to the spinlock pointer - * - * Returns a pointer to the locked pte for a guest address, or NULL - */ -static pte_t *gmap_pte_op_walk(struct gmap *gmap, unsigned long gaddr, - spinlock_t **ptl) -{ - unsigned long *table; - - BUG_ON(gmap_is_shadow(gmap)); - /* Walk the gmap page table, lock and get pte pointer */ - table = gmap_table_walk(gmap, gaddr, 1); /* get segment pointer */ - if (!table || *table & _SEGMENT_ENTRY_INVALID) - return NULL; - return pte_alloc_map_lock(gmap->mm, (pmd_t *) table, gaddr, ptl); -} - /** * gmap_fixup - force memory in and connect the gmap table entry * @gmap: pointer to guest mapping meta data structure @@ -1454,6 +1432,7 @@ static int gmap_protect_rmap(struct gmap *sg, unsigned long raddr, } #define _SHADOW_RMAP_MASK 0x7 +#define _SHADOW_RMAP_SEGMENT_LP 0x6 #define _SHADOW_RMAP_REGION1 0x5 #define _SHADOW_RMAP_REGION2 0x4 #define _SHADOW_RMAP_REGION3 0x3 @@ -1559,15 +1538,18 @@ static void __gmap_unshadow_sgt(struct gmap *sg, unsigned long raddr, BUG_ON(!gmap_is_shadow(sg)); for (i = 0; i < _CRST_ENTRIES; i++, raddr += _SEGMENT_SIZE) { - if (!(sgt[i] & _SEGMENT_ENTRY_ORIGIN)) + if (sgt[i] == _SEGMENT_ENTRY_EMPTY) continue; - pgt = 
(unsigned long *)(sgt[i] & _REGION_ENTRY_ORIGIN); + + if (!(sgt[i] & _SEGMENT_ENTRY_LARGE)) { + pgt = (unsigned long *)(sgt[i] & _SEGMENT_ENTRY_ORIGIN); + __gmap_unshadow_pgt(sg, raddr, pgt); + /* Free page table */ + page = pfn_to_page(__pa(pgt) >> PAGE_SHIFT); + list_del(&page->lru); + page_table_free_pgste(page); + } sgt[i] = _SEGMENT_ENTRY_EMPTY; - __gmap_unshadow_pgt(sg, raddr, pgt); - /* Free page table */ - page = pfn_to_page(__pa(pgt) >> PAGE_SHIFT); - list_del(&page->lru); - page_table_free_pgste(page); } } @@ -2173,7 +2155,7 @@ EXPORT_SYMBOL_GPL(gmap_shadow_sgt); /** * gmap_shadow_lookup_pgtable - find a shadow page table * @sg: pointer to the shadow guest address space structure - * @saddr: the address in the shadow aguest address space + * @saddr: the address in the shadow guest address space * @pgt: parent gmap address of the page table to get shadowed * @dat_protection: if the pgtable is marked as protected by dat * @fake: pgt references contiguous guest memory block, not a pgtable @@ -2183,32 +2165,64 @@ EXPORT_SYMBOL_GPL(gmap_shadow_sgt); * * Called with sg->mm->mmap_sem in read. */ -int gmap_shadow_pgt_lookup(struct gmap *sg, unsigned long saddr, - unsigned long *pgt, int *dat_protection, - int *fake) +void gmap_shadow_pgt_lookup(struct gmap *sg, unsigned long *sge, + unsigned long saddr, unsigned long *pgt, + int *dat_protection, int *fake) { - unsigned long *table; struct page *page; - int rc; + + /* Shadow page tables are full pages (pte+pgste) */ + page = pfn_to_page(*sge >> PAGE_SHIFT); + *pgt = page->index & ~GMAP_SHADOW_FAKE_TABLE; + *dat_protection = !!(*sge & _SEGMENT_ENTRY_PROTECT); + *fake = !!(page->index & GMAP_SHADOW_FAKE_TABLE); +} +EXPORT_SYMBOL_GPL(gmap_shadow_pgt_lookup); + +int gmap_shadow_sgt_lookup(struct gmap *sg, unsigned long saddr, + unsigned long *pgt, int *dat_protection, + int *fake, int *lvl) +{ + unsigned long *sge, *r3e = NULL; + struct page *page; + int rc = -EAGAIN; BUG_ON(!gmap_is_shadow(sg)); spin_lock(&sg->guest_table_lock); - table = gmap_table_walk(sg, saddr, 1); /* get segment pointer */ - if (table && !(*table & _SEGMENT_ENTRY_INVALID)) { - /* Shadow page tables are full pages (pte+pgste) */ - page = pfn_to_page(*table >> PAGE_SHIFT); - *pgt = page->index & ~GMAP_SHADOW_FAKE_TABLE; - *dat_protection = !!(*table & _SEGMENT_ENTRY_PROTECT); - *fake = !!(page->index & GMAP_SHADOW_FAKE_TABLE); - rc = 0; - } else { - rc = -EAGAIN; + if (sg->asce & _ASCE_TYPE_MASK) { + /* >2 GB guest */ + r3e = (unsigned long *) gmap_table_walk(sg, saddr, 2); + if (!r3e || (*r3e & _REGION_ENTRY_INVALID)) + goto out; + sge = (unsigned long *)(*r3e & _REGION_ENTRY_ORIGIN) + ((saddr & _SEGMENT_INDEX) >> _SEGMENT_SHIFT); + } else { + sge = (unsigned long *)(sg->asce & PAGE_MASK) + ((saddr & _SEGMENT_INDEX) >> _SEGMENT_SHIFT); } + if (*sge & _SEGMENT_ENTRY_INVALID) + goto out; + rc = 0; + if (*sge & _SEGMENT_ENTRY_LARGE) { + if (r3e) { + page = pfn_to_page(*r3e >> PAGE_SHIFT); + *pgt = page->index & ~GMAP_SHADOW_FAKE_TABLE; + *dat_protection = !!(*r3e & _SEGMENT_ENTRY_PROTECT); + *fake = !!(page->index & GMAP_SHADOW_FAKE_TABLE); + } else { + *pgt = sg->orig_asce & PAGE_MASK; + *dat_protection = 0; + *fake = 0; + } + *lvl = 1; + } else { + gmap_shadow_pgt_lookup(sg, sge, saddr, pgt, + dat_protection, fake); + *lvl = 0; + } +out: spin_unlock(&sg->guest_table_lock); return rc; - } -EXPORT_SYMBOL_GPL(gmap_shadow_pgt_lookup); +EXPORT_SYMBOL_GPL(gmap_shadow_sgt_lookup); /** * gmap_shadow_pgt - instantiate a shadow page table @@ -2290,6 +2304,94 @@ int 
gmap_shadow_pgt(struct gmap *sg, unsigned long saddr, unsigned long pgt, } EXPORT_SYMBOL_GPL(gmap_shadow_pgt); +int gmap_shadow_segment(struct gmap *sg, unsigned long saddr, pmd_t pmd) +{ + struct gmap *parent; + struct gmap_rmap *rmap; + unsigned long vmaddr, paddr; + spinlock_t *ptl = NULL; + pmd_t spmd, tpmd, *spmdp = NULL, *tpmdp; + int prot; + int rc; + + BUG_ON(!gmap_is_shadow(sg)); + parent = sg->parent; + + prot = (pmd_val(pmd) & _SEGMENT_ENTRY_PROTECT) ? PROT_READ : PROT_WRITE; + rmap = kzalloc(sizeof(*rmap), GFP_KERNEL); + if (!rmap) + return -ENOMEM; + rmap->raddr = (saddr & HPAGE_MASK) | _SHADOW_RMAP_SEGMENT_LP; + + while (1) { + paddr = pmd_val(pmd) & HPAGE_MASK; + vmaddr = __gmap_translate(parent, paddr); + if (IS_ERR_VALUE(vmaddr)) { + rc = vmaddr; + break; + } + rc = radix_tree_preload(GFP_KERNEL); + if (rc) + break; + rc = -EAGAIN; + + /* Let's look up the parent's mapping */ + spmdp = gmap_pmd_op_walk(parent, paddr, vmaddr, &ptl); + if (spmdp) { + if (gmap_pmd_is_split(spmdp)) { + gmap_pmd_op_end(ptl); + radix_tree_preload_end(); + rc = -EISDIR; + break; + } + spin_lock(&sg->guest_table_lock); + /* Get shadow segment table pointer */ + tpmdp = (pmd_t *) gmap_table_walk(sg, saddr, 1); + if (!tpmdp) { + spin_unlock(&sg->guest_table_lock); + gmap_pmd_op_end(ptl); + radix_tree_preload_end(); + break; + } + /* Shadowing magic happens here. */ + if (!(pmd_val(*tpmdp) & _SEGMENT_ENTRY_INVALID)) { + rc = 0; /* already shadowed */ + spin_unlock(&sg->guest_table_lock); + gmap_pmd_op_end(ptl); + radix_tree_preload_end(); + break; + } + spmd = *spmdp; + if (!(pmd_val(spmd) & _SEGMENT_ENTRY_INVALID) && + !((pmd_val(spmd) & _SEGMENT_ENTRY_PROTECT) && + !(pmd_val(pmd) & _SEGMENT_ENTRY_PROTECT))) { + + pmd_val(*spmdp) |= _SEGMENT_ENTRY_GMAP_VSIE; + + /* Insert shadow ste */ + pmd_val(tpmd) = ((pmd_val(spmd) & + _SEGMENT_ENTRY_HARDWARE_BITS_LARGE) | + (pmd_val(pmd) & _SEGMENT_ENTRY_PROTECT)); + *tpmdp = tpmd; + gmap_insert_rmap(sg, vmaddr, rmap); + rc = 0; + } + spin_unlock(&sg->guest_table_lock); + gmap_pmd_op_end(ptl); + } + radix_tree_preload_end(); + if (!rc) + break; + rc = gmap_fixup(parent, paddr, vmaddr, prot); + if (rc) + break; + } + if (rc) + kfree(rmap); + return rc; +} +EXPORT_SYMBOL_GPL(gmap_shadow_segment); + /** * gmap_shadow_page - create a shadow page mapping * @sg: pointer to the shadow guest address space structure @@ -2307,7 +2409,8 @@ int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte) struct gmap *parent; struct gmap_rmap *rmap; unsigned long vmaddr, paddr; - spinlock_t *ptl; + spinlock_t *ptl_pmd = NULL, *ptl_pte = NULL; + pmd_t *spmdp; pte_t *sptep, *tptep; int prot; int rc; @@ -2332,26 +2435,46 @@ int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte) if (rc) break; rc = -EAGAIN; - sptep = gmap_pte_op_walk(parent, paddr, &ptl); - if (sptep) { - spin_lock(&sg->guest_table_lock); + spmdp = gmap_pmd_op_walk(parent, paddr, vmaddr, &ptl_pmd); + if (spmdp && !(pmd_val(*spmdp) & _SEGMENT_ENTRY_INVALID)) { /* Get page table pointer */ tptep = (pte_t *) gmap_table_walk(sg, saddr, 0); if (!tptep) { - spin_unlock(&sg->guest_table_lock); - gmap_pte_op_end(ptl); radix_tree_preload_end(); + gmap_pmd_op_end(ptl_pmd); break; } - rc = ptep_shadow_pte(sg->mm, saddr, sptep, tptep, pte); - if (rc > 0) { - /* Success and a new mapping */ - gmap_insert_rmap(sg, vmaddr, rmap); - rmap = NULL; - rc = 0; + + if (pmd_large(*spmdp)) { + pte_t spte; + if (!(pmd_val(*spmdp) & _SEGMENT_ENTRY_PROTECT)) { + spin_lock(&sg->guest_table_lock); + spte = 
__pte((pmd_val(*spmdp) & + _SEGMENT_ENTRY_ORIGIN_LARGE) + + (pte_index(paddr) << 12)); + ptep_shadow_set(spte, tptep, pte); + pmd_val(*spmdp) |= _SEGMENT_ENTRY_GMAP_VSIE; + gmap_insert_rmap(sg, vmaddr, rmap); + rmap = NULL; + rc = 0; + spin_unlock(&sg->guest_table_lock); + } + } else { + sptep = gmap_pte_from_pmd(parent, spmdp, paddr, &ptl_pte); + spin_lock(&sg->guest_table_lock); + if (sptep) { + rc = ptep_shadow_pte(sg->mm, saddr, sptep, tptep, pte); + if (rc > 0) { + /* Success and a new mapping */ + gmap_insert_rmap(sg, vmaddr, rmap); + rmap = NULL; + rc = 0; + } + spin_unlock(&sg->guest_table_lock); + gmap_pte_op_end(ptl_pte); + } } - gmap_pte_op_end(ptl); - spin_unlock(&sg->guest_table_lock); + gmap_pmd_op_end(ptl_pmd); } radix_tree_preload_end(); if (!rc) @@ -2365,6 +2488,75 @@ int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte) } EXPORT_SYMBOL_GPL(gmap_shadow_page); +/** + * gmap_unshadow_segment - remove a huge segment from a shadow segment table + * @sg: pointer to the shadow guest address space structure + * @raddr: rmap address in the shadow guest address space + * + * Called with the sg->guest_table_lock + */ +static void gmap_unshadow_segment(struct gmap *sg, unsigned long raddr) +{ + unsigned long *table; + + BUG_ON(!gmap_is_shadow(sg)); + /* We already have the lock */ + table = gmap_table_walk(sg, raddr, 1); /* get segment table pointer */ + if (!table || *table & _SEGMENT_ENTRY_INVALID || + !(*table & _SEGMENT_ENTRY_LARGE)) + return; + gmap_call_notifier(sg, raddr, raddr + HPAGE_SIZE - 1); + gmap_idte_global(sg->asce, (pmd_t *)table, raddr); + *table = _SEGMENT_ENTRY_EMPTY; +} + +static void gmap_shadow_notify_pmd(struct gmap *sg, unsigned long vmaddr, + unsigned long gaddr) +{ + struct gmap_rmap *rmap, *rnext, *head; + unsigned long start, end, bits, raddr; + + + BUG_ON(!gmap_is_shadow(sg)); + + spin_lock(&sg->guest_table_lock); + if (sg->removed) { + spin_unlock(&sg->guest_table_lock); + return; + } + /* Check for top level table */ + start = sg->orig_asce & _ASCE_ORIGIN; + end = start + ((sg->orig_asce & _ASCE_TABLE_LENGTH) + 1) * PAGE_SIZE; + if (!(sg->orig_asce & _ASCE_REAL_SPACE) && gaddr >= start && + gaddr < ((end & HPAGE_MASK) + HPAGE_SIZE - 1)) { + /* The complete shadow table has to go */ + gmap_unshadow(sg); + spin_unlock(&sg->guest_table_lock); + list_del(&sg->list); + gmap_put(sg); + return; + } + /* Remove the page table tree from on specific entry */ + head = radix_tree_delete(&sg->host_to_rmap, (vmaddr & HPAGE_MASK) >> PAGE_SHIFT); + gmap_for_each_rmap_safe(rmap, rnext, head) { + bits = rmap->raddr & _SHADOW_RMAP_MASK; + raddr = rmap->raddr ^ bits; + switch (bits) { + case _SHADOW_RMAP_SEGMENT_LP: + gmap_unshadow_segment(sg, raddr); + break; + case _SHADOW_RMAP_PGTABLE: + gmap_unshadow_page(sg, raddr); + break; + default: + BUG(); + } + kfree(rmap); + } + spin_unlock(&sg->guest_table_lock); +} + + /** * gmap_shadow_notify - handle notifications for shadow gmap * @@ -2416,6 +2608,8 @@ static void gmap_shadow_notify(struct gmap *sg, unsigned long vmaddr, case _SHADOW_RMAP_PGTABLE: gmap_unshadow_page(sg, raddr); break; + default: + BUG(); } kfree(rmap); } @@ -2499,10 +2693,21 @@ static inline void pmdp_notify_split(struct gmap *gmap, pmd_t *pmdp, static void pmdp_notify_gmap(struct gmap *gmap, pmd_t *pmdp, unsigned long gaddr, unsigned long vmaddr) { + struct gmap *sg, *next; + BUG_ON((gaddr & ~HPAGE_MASK) || (vmaddr & ~HPAGE_MASK)); if (gmap_pmd_is_split(pmdp)) return pmdp_notify_split(gmap, pmdp, gaddr, vmaddr); + if 
(!list_empty(&gmap->children) && + (pmd_val(*pmdp) & _SEGMENT_ENTRY_GMAP_VSIE)) { + spin_lock(&gmap->shadow_lock); + list_for_each_entry_safe(sg, next, &gmap->children, list) + gmap_shadow_notify_pmd(sg, vmaddr, gaddr); + spin_unlock(&gmap->shadow_lock); + } + pmd_val(*pmdp) &= ~_SEGMENT_ENTRY_GMAP_VSIE; + if (!(pmd_val(*pmdp) & _SEGMENT_ENTRY_GMAP_IN)) return; pmd_val(*pmdp) &= ~_SEGMENT_ENTRY_GMAP_IN; From patchwork Wed Sep 19 08:48:00 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Janosch Frank X-Patchwork-Id: 10605521 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 403F115A6 for ; Wed, 19 Sep 2018 08:49:27 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2CED92B59F for ; Wed, 19 Sep 2018 08:49:27 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 217422B5AA; Wed, 19 Sep 2018 08:49:27 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 63CE22B59F for ; Wed, 19 Sep 2018 08:49:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731184AbeISO0P (ORCPT ); Wed, 19 Sep 2018 10:26:15 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:40642 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731172AbeISO0P (ORCPT ); Wed, 19 Sep 2018 10:26:15 -0400 Received: from pps.filterd (m0098399.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w8J8iWJr052048 for ; Wed, 19 Sep 2018 04:49:21 -0400 Received: from e06smtp04.uk.ibm.com (e06smtp04.uk.ibm.com [195.75.94.100]) by mx0a-001b2d01.pphosted.com with ESMTP id 2mkgtce9dc-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Wed, 19 Sep 2018 04:49:21 -0400 Received: from localhost by e06smtp04.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 19 Sep 2018 09:49:19 +0100 Received: from b06cxnps4075.portsmouth.uk.ibm.com (9.149.109.197) by e06smtp04.uk.ibm.com (192.168.101.134) with IBM ESMTP SMTP Gateway: Authorized Use Only! 
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com, schwidefsky@de.ibm.com
Subject: [RFC 12/14] s390/mm: Add gmap lock classes
Date: Wed, 19 Sep 2018 10:48:00 +0200
Message-Id: <20180919084802.183381-13-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

A shadow gmap and its parent are locked right after each other when
doing VSIE management. Lockdep cannot differentiate between the two
locks without some help, so introduce explicit lock classes that mark
a gmap's guest_table_lock as belonging to either a parent or a shadow
gmap.
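To illustrate the pattern, here is a minimal kernel-style sketch (the
struct is a simplified stand-in; only the enum and the subclass idea
come from the patch itself):

#include <linux/spinlock.h>

/* Mirrors the enum this patch adds to asm/gmap.h */
enum gmap_lock_class {
	GMAP_LOCK_PARENT,
	GMAP_LOCK_SHADOW
};

/* Simplified stand-in for struct gmap */
struct toy_gmap {
	spinlock_t guest_table_lock;
};

static void toy_vsie_update(struct toy_gmap *parent, struct toy_gmap *sg)
{
	/* Outer lock, implicitly subclass 0 (GMAP_LOCK_PARENT) */
	spin_lock(&parent->guest_table_lock);
	/*
	 * Both locks are instances of the same lock class (every gmap
	 * embeds a guest_table_lock), so a plain spin_lock() here would
	 * look like recursive locking to lockdep.  spin_lock_nested()
	 * gives the shadow lock its own subclass and documents the
	 * order: parent first, then shadow.
	 */
	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
	/* ... update the shadow tables under both locks ... */
	spin_unlock(&sg->guest_table_lock);
	spin_unlock(&parent->guest_table_lock);
}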
Signed-off-by: Janosch Frank
---
 arch/s390/include/asm/gmap.h |  6 ++++++
 arch/s390/mm/gmap.c          | 40 +++++++++++++++++++++-------------------
 2 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/arch/s390/include/asm/gmap.h b/arch/s390/include/asm/gmap.h
index 3df7a004e6e5..95d5e849088e 100644
--- a/arch/s390/include/asm/gmap.h
+++ b/arch/s390/include/asm/gmap.h
@@ -23,6 +23,12 @@
 #define GMAP_SEGMENT_STATUS_BITS (_SEGMENT_ENTRY_GMAP_UC | _SEGMENT_ENTRY_GMAP_SPLIT)
 #define GMAP_SEGMENT_NOTIFY_BITS (_SEGMENT_ENTRY_GMAP_IN | _SEGMENT_ENTRY_GMAP_VSIE)
+
+enum gmap_lock_class {
+	GMAP_LOCK_PARENT,
+	GMAP_LOCK_SHADOW
+};
+
 /**
  * struct gmap_struct - guest address space
  * @list: list head for the mm->context gmap list
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index f697c73afba3..0220a32aa2b9 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -1331,7 +1331,7 @@ static int gmap_protect_rmap_pte(struct gmap *sg, struct gmap_rmap *rmap,
 {
 	int rc = 0;
 
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	rc = gmap_protect_pte(sg->parent, paddr, vmaddr, ptep, prot,
 			      GMAP_NOTIFY_SHADOW);
 	if (!rc)
@@ -1860,7 +1860,7 @@ struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce,
 	/* only allow one real-space gmap shadow */
 	list_for_each_entry(sg, &parent->children, list) {
 		if (sg->orig_asce & _ASCE_REAL_SPACE) {
-			spin_lock(&sg->guest_table_lock);
+			spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 			gmap_unshadow(sg);
 			spin_unlock(&sg->guest_table_lock);
 			list_del(&sg->list);
@@ -1932,7 +1932,7 @@ int gmap_shadow_r2t(struct gmap *sg, unsigned long saddr, unsigned long r2t,
 	page->index |= GMAP_SHADOW_FAKE_TABLE;
 	s_r2t = (unsigned long *) page_to_phys(page);
 	/* Install shadow region second table */
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	table = gmap_table_walk(sg, saddr, 4); /* get region-1 pointer */
 	if (!table) {
 		rc = -EAGAIN;		/* Race with unshadow */
@@ -1965,7 +1965,7 @@ int gmap_shadow_r2t(struct gmap *sg, unsigned long saddr, unsigned long r2t,
 	offset = ((r2t & _REGION_ENTRY_OFFSET) >> 6) * PAGE_SIZE;
 	len = ((r2t & _REGION_ENTRY_LENGTH) + 1) * PAGE_SIZE - offset;
 	rc = gmap_protect_rmap(sg, raddr, origin + offset, len);
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	if (!rc) {
 		table = gmap_table_walk(sg, saddr, 4);
 		if (!table || (*table & _REGION_ENTRY_ORIGIN) !=
@@ -2016,7 +2016,7 @@ int gmap_shadow_r3t(struct gmap *sg, unsigned long saddr, unsigned long r3t,
 	page->index |= GMAP_SHADOW_FAKE_TABLE;
 	s_r3t = (unsigned long *) page_to_phys(page);
 	/* Install shadow region second table */
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	table = gmap_table_walk(sg, saddr, 3); /* get region-2 pointer */
 	if (!table) {
 		rc = -EAGAIN;		/* Race with unshadow */
@@ -2048,7 +2048,7 @@ int gmap_shadow_r3t(struct gmap *sg, unsigned long saddr, unsigned long r3t,
 	offset = ((r3t & _REGION_ENTRY_OFFSET) >> 6) * PAGE_SIZE;
 	len = ((r3t & _REGION_ENTRY_LENGTH) + 1) * PAGE_SIZE - offset;
 	rc = gmap_protect_rmap(sg, raddr, origin + offset, len);
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	if (!rc) {
 		table = gmap_table_walk(sg, saddr, 3);
 		if (!table || (*table & _REGION_ENTRY_ORIGIN) !=
@@ -2099,7 +2099,7 @@ int gmap_shadow_sgt(struct gmap *sg, unsigned long saddr, unsigned long sgt,
 	page->index |= GMAP_SHADOW_FAKE_TABLE;
 	s_sgt = (unsigned long *) page_to_phys(page);
 	/* Install shadow region second table */
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	table = gmap_table_walk(sg, saddr, 2); /* get region-3 pointer */
 	if (!table) {
 		rc = -EAGAIN;		/* Race with unshadow */
@@ -2132,7 +2132,7 @@ int gmap_shadow_sgt(struct gmap *sg, unsigned long saddr, unsigned long sgt,
 	offset = ((sgt & _REGION_ENTRY_OFFSET) >> 6) * PAGE_SIZE;
 	len = ((sgt & _REGION_ENTRY_LENGTH) + 1) * PAGE_SIZE - offset;
 	rc = gmap_protect_rmap(sg, raddr, origin + offset, len);
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	if (!rc) {
 		table = gmap_table_walk(sg, saddr, 2);
 		if (!table || (*table & _REGION_ENTRY_ORIGIN) !=
@@ -2188,7 +2188,7 @@ int gmap_shadow_sgt_lookup(struct gmap *sg, unsigned long saddr,
 	int rc = -EAGAIN;
 
 	BUG_ON(!gmap_is_shadow(sg));
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	if (sg->asce & _ASCE_TYPE_MASK) {
 		/* >2 GB guest */
 		r3e = (unsigned long *) gmap_table_walk(sg, saddr, 2);
@@ -2255,7 +2255,7 @@ int gmap_shadow_pgt(struct gmap *sg, unsigned long saddr, unsigned long pgt,
 	page->index |= GMAP_SHADOW_FAKE_TABLE;
 	s_pgt = (unsigned long *) page_to_phys(page);
 	/* Install shadow page table */
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	table = gmap_table_walk(sg, saddr, 1); /* get segment pointer */
 	if (!table) {
 		rc = -EAGAIN;		/* Race with unshadow */
@@ -2283,7 +2283,7 @@ int gmap_shadow_pgt(struct gmap *sg, unsigned long saddr, unsigned long pgt,
 	raddr = (saddr & _SEGMENT_MASK) | _SHADOW_RMAP_SEGMENT;
 	origin = pgt & _SEGMENT_ENTRY_ORIGIN & PAGE_MASK;
 	rc = gmap_protect_rmap(sg, raddr, origin, PAGE_SIZE);
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	if (!rc) {
 		table = gmap_table_walk(sg, saddr, 1);
 		if (!table || (*table & _SEGMENT_ENTRY_ORIGIN) !=
@@ -2344,7 +2344,7 @@ int gmap_shadow_segment(struct gmap *sg, unsigned long saddr, pmd_t pmd)
 			rc = -EISDIR;
 			break;
 		}
-		spin_lock(&sg->guest_table_lock);
+		spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 		/* Get shadow segment table pointer */
 		tpmdp = (pmd_t *) gmap_table_walk(sg, saddr, 1);
 		if (!tpmdp) {
@@ -2448,7 +2448,8 @@ int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte)
 		if (pmd_large(*spmdp)) {
 			pte_t spte;
 			if (!(pmd_val(*spmdp) & _SEGMENT_ENTRY_PROTECT)) {
-				spin_lock(&sg->guest_table_lock);
+				spin_lock_nested(&sg->guest_table_lock,
+						 GMAP_LOCK_SHADOW);
 				spte = __pte((pmd_val(*spmdp) &
 					      _SEGMENT_ENTRY_ORIGIN_LARGE)
 					     + (pte_index(paddr) << 12));
@@ -2461,7 +2462,8 @@ int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte)
 			}
 		} else {
 			sptep = gmap_pte_from_pmd(parent, spmdp, paddr, &ptl_pte);
-			spin_lock(&sg->guest_table_lock);
+			spin_lock_nested(&sg->guest_table_lock,
+					 GMAP_LOCK_SHADOW);
 			if (sptep) {
 				rc = ptep_shadow_pte(sg->mm, saddr, sptep, tptep, pte);
 				if (rc > 0) {
@@ -2519,7 +2521,7 @@ static void gmap_shadow_notify_pmd(struct gmap *sg, unsigned long vmaddr,
 
 	BUG_ON(!gmap_is_shadow(sg));
 
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	if (sg->removed) {
 		spin_unlock(&sg->guest_table_lock);
 		return;
@@ -2570,7 +2572,7 @@ static void gmap_shadow_notify(struct gmap *sg, unsigned long vmaddr,
 
 	BUG_ON(!gmap_is_shadow(sg));
 
-	spin_lock(&sg->guest_table_lock);
+	spin_lock_nested(&sg->guest_table_lock, GMAP_LOCK_SHADOW);
 	if (sg->removed) {
 		spin_unlock(&sg->guest_table_lock);
 		return;
@@ -2745,7 +2747,7 @@ static void gmap_pmdp_clear(struct mm_struct *mm, unsigned long vmaddr,
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {
-		spin_lock(&gmap->guest_table_lock);
+		spin_lock_nested(&gmap->guest_table_lock, GMAP_LOCK_PARENT);
 		pmdp = (pmd_t *)radix_tree_delete(&gmap->host_to_guest,
 						  vmaddr >> PMD_SHIFT);
 		if (pmdp) {
@@ -2800,7 +2802,7 @@ void gmap_pmdp_idte_local(struct mm_struct *mm, unsigned long vmaddr)
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {
-		spin_lock(&gmap->guest_table_lock);
+		spin_lock_nested(&gmap->guest_table_lock, GMAP_LOCK_PARENT);
 		entry = radix_tree_delete(&gmap->host_to_guest,
 					  vmaddr >> PMD_SHIFT);
 		if (entry) {
@@ -2836,7 +2838,7 @@ void gmap_pmdp_idte_global(struct mm_struct *mm, unsigned long vmaddr)
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(gmap, &mm->context.gmap_list, list) {
-		spin_lock(&gmap->guest_table_lock);
+		spin_lock_nested(&gmap->guest_table_lock, GMAP_LOCK_PARENT);
 		entry = radix_tree_delete(&gmap->host_to_guest,
 					  vmaddr >> PMD_SHIFT);
 		if (entry) {

From patchwork Wed Sep 19 08:48:01 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605517
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com, schwidefsky@de.ibm.com
Subject: [RFC 13/14] s390/mm: Pull pmd invalid check in gmap_pmd_op_walk
Date: Wed, 19 Sep 2018 10:48:01 +0200
Message-Id: <20180919084802.183381-14-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

I'm not yet sure if I'll keep this. Strictly speaking, the walk
should only walk and not check the invalid (I) bit, but with the
check pulled into gmap_pmd_op_walk() the callers look much nicer.
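As a hedged illustration of that walker contract (toy types only,
nothing here is code from this series): once the walk itself rejects
empty and invalid entries, a single NULL test is all a caller needs.

#include <stddef.h>

#define TOY_ENTRY_INVALID 0x20UL	/* stand-in for _SEGMENT_ENTRY_INVALID */

struct toy_table {
	unsigned long entry[2048];
};

/* The walk filters absent and invalid entries, as gmap_pmd_op_walk now does */
static unsigned long *toy_walk(struct toy_table *t, unsigned long idx)
{
	unsigned long *e = &t->entry[idx % 2048];

	if (*e == 0 || (*e & TOY_ENTRY_INVALID))
		return NULL;
	return e;
}

/* Callers lose their duplicated "& INVALID" re-checks after the walk */
static int toy_protect(struct toy_table *t, unsigned long idx)
{
	unsigned long *e = toy_walk(t, idx);

	if (!e)
		return -1;	/* would be -EAGAIN in the gmap code */
	*e |= 0x200UL;		/* operate on a known-valid entry */
	return 0;
}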
Signed-off-by: Janosch Frank
---
 arch/s390/mm/gmap.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 0220a32aa2b9..8d5ce51637eb 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -952,7 +952,8 @@ static inline pmd_t *gmap_pmd_op_walk(struct gmap *gmap, unsigned long gaddr,
 	}
 
 	pmdp = (pmd_t *) gmap_table_walk(gmap, gaddr, 1);
-	if (!pmdp || pmd_none(*pmdp)) {
+	if (!pmdp || pmd_none(*pmdp) ||
+	    pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID) {
 		if (*ptl)
 			spin_unlock(*ptl);
 		pmdp = NULL;
@@ -1161,7 +1162,7 @@ static int gmap_protect_range(struct gmap *gmap, unsigned long gaddr,
 			return vmaddr;
 		vmaddr |= gaddr & ~PMD_MASK;
 		pmdp = gmap_pmd_op_walk(gmap, gaddr, vmaddr, &ptl_pmd);
-		if (pmdp && !(pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)) {
+		if (pmdp) {
 			if (!pmd_large(*pmdp)) {
 				ptep = gmap_pte_from_pmd(gmap, pmdp, gaddr,
 							 &ptl_pte);
@@ -1266,7 +1267,7 @@ int gmap_read_table(struct gmap *gmap, unsigned long gaddr, unsigned long *val)
 		if (IS_ERR_VALUE(vmaddr))
 			return vmaddr;
 		pmdp = gmap_pmd_op_walk(gmap, gaddr, vmaddr, &ptl_pmd);
-		if (pmdp && !(pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)) {
+		if (pmdp) {
 			if (!pmd_large(*pmdp)) {
 				ptep = gmap_pte_from_pmd(gmap, pmdp, vmaddr, &ptl_pte);
 				if (ptep) {
@@ -1380,7 +1381,7 @@ static int gmap_protect_rmap(struct gmap *sg, unsigned long raddr,
 			return vmaddr;
 		vmaddr |= paddr & ~PMD_MASK;
 		pmdp = gmap_pmd_op_walk(parent, paddr, vmaddr, &ptl_pmd);
-		if (pmdp && !(pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)) {
+		if (pmdp) {
 			if (!pmd_large(*pmdp)) {
 				ptl_pte = NULL;
 				ptep = gmap_pte_from_pmd(parent, pmdp, paddr,
@@ -2362,8 +2363,7 @@ int gmap_shadow_segment(struct gmap *sg, unsigned long saddr, pmd_t pmd)
 			break;
 		}
 		spmd = *spmdp;
-		if (!(pmd_val(spmd) & _SEGMENT_ENTRY_INVALID) &&
-		    !((pmd_val(spmd) & _SEGMENT_ENTRY_PROTECT) &&
+		if (!((pmd_val(spmd) & _SEGMENT_ENTRY_PROTECT) &&
 		      !(pmd_val(pmd) & _SEGMENT_ENTRY_PROTECT))) {
 
 			pmd_val(*spmdp) |= _SEGMENT_ENTRY_GMAP_VSIE;
@@ -2436,7 +2436,7 @@ int gmap_shadow_page(struct gmap *sg, unsigned long saddr, pte_t pte)
 			break;
 		rc = -EAGAIN;
 		spmdp = gmap_pmd_op_walk(parent, paddr, vmaddr, &ptl_pmd);
-		if (spmdp && !(pmd_val(*spmdp) & _SEGMENT_ENTRY_INVALID)) {
+		if (spmdp) {
 			/* Get page table pointer */
 			tptep = (pte_t *) gmap_table_walk(sg, saddr, 0);
 			if (!tptep) {
@@ -2869,9 +2869,6 @@ EXPORT_SYMBOL_GPL(gmap_pmdp_idte_global);
 bool gmap_test_and_clear_dirty_pmd(struct gmap *gmap, pmd_t *pmdp,
 				   unsigned long gaddr, unsigned long vmaddr)
 {
-	if (pmd_val(*pmdp) & _SEGMENT_ENTRY_INVALID)
-		return false;
-
 	/* Already protected memory, which did not change is clean */
 	if (pmd_val(*pmdp) & _SEGMENT_ENTRY_PROTECT &&
 	    !(pmd_val(*pmdp) & _SEGMENT_ENTRY_GMAP_UC))

From patchwork Wed Sep 19 08:48:02 2018
X-Patchwork-Submitter: Janosch Frank
X-Patchwork-Id: 10605523
From: Janosch Frank
To: kvm@vger.kernel.org
Cc: linux-s390@vger.kernel.org, david@redhat.com, borntraeger@de.ibm.com, schwidefsky@de.ibm.com
Subject: [RFC 14/14] KVM: s390: Allow the VSIE to be used with huge pages
Date: Wed, 19 Sep 2018 10:48:02 +0200
Message-Id: <20180919084802.183381-15-frankja@linux.ibm.com>
In-Reply-To: <20180919084802.183381-1-frankja@linux.ibm.com>
References: <20180919084802.183381-1-frankja@linux.ibm.com>

Now that we have VSIE support for VMs with huge memory backing, let's
make both features usable at the same time.

Signed-off-by: Janosch Frank
---
 Documentation/virtual/kvm/api.txt |  7 +++----
 arch/s390/kvm/kvm-s390.c          | 14 ++------------
 arch/s390/mm/gmap.c               |  1 -
 3 files changed, 5 insertions(+), 17 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 8d8a372c8340..9ec6d23f41d5 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4509,15 +4509,14 @@ Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
 
 Architectures: s390
 Parameters: none
-Returns: 0 on success, -EINVAL if hpage module parameter was not set
-	 or cmma is enabled, or the VM has the KVM_VM_S390_UCONTROL
+Returns: 0 on success, -EINVAL if cmma is enabled, or the VM has the KVM_VM_S390_UCONTROL
 	 flag set
 
 With this capability the KVM support for memory backing with 1m pages
 through hugetlbfs can be enabled for a VM. After the capability is
 enabled, cmma can't be enabled anymore and pfmfi and the storage key
-interpretation are disabled. If cmma has already been enabled or the
-hpage module parameter is not set to 1, -EINVAL is returned.
+interpretation are disabled. If cmma has already been enabled, -EINVAL
+is returned.
 
 While it is generally possible to create a huge page backed VM without
 this capability, the VM will not be able to run.
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index ac5da6b0b862..59f53b7c72d6 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -172,11 +172,6 @@ static int nested;
 module_param(nested, int, S_IRUGO);
 MODULE_PARM_DESC(nested, "Nested virtualization support");
 
-/* allow 1m huge page guest backing, if !nested */
-static int hpage;
-module_param(hpage, int, 0444);
-MODULE_PARM_DESC(hpage, "1m huge page backing support");
-
 /*
  * For now we handle at most 16 double words as this is what the s390 base
  * kernel handles and stores in the prefix page. If we ever need to go beyond
@@ -481,7 +476,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		break;
 	case KVM_CAP_S390_HPAGE_1M:
 		r = 0;
-		if (hpage && !kvm_is_ucontrol(kvm))
+		if (!kvm_is_ucontrol(kvm))
 			r = 1;
 		break;
 	case KVM_CAP_S390_MEM_OP:
@@ -691,7 +686,7 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
 		mutex_lock(&kvm->lock);
 		if (kvm->created_vcpus)
 			r = -EBUSY;
-		else if (!hpage || kvm->arch.use_cmma || kvm_is_ucontrol(kvm))
+		else if (kvm->arch.use_cmma || kvm_is_ucontrol(kvm))
 			r = -EINVAL;
 		else {
 			r = 0;
@@ -4196,11 +4191,6 @@ static int __init kvm_s390_init(void)
 		return -ENODEV;
 	}
 
-	if (nested && hpage) {
-		pr_info("nested (vSIE) and hpage (huge page backing) can currently not be activated concurrently");
-		return -EINVAL;
-	}
-
 	for (i = 0; i < 16; i++)
 		kvm_s390_fac_base[i] |=
 			S390_lowcore.stfle_fac_list[i] & nonhyp_mask(i);
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 8d5ce51637eb..928cb5818a21 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -1830,7 +1830,6 @@ struct gmap *gmap_shadow(struct gmap *parent, unsigned long asce,
 	unsigned long limit;
 	int rc;
 
-	BUG_ON(parent->mm->context.allow_gmap_hpage_1m);
 	BUG_ON(gmap_is_shadow(parent));
 	spin_lock(&parent->shadow_lock);
 	sg = gmap_find_shadow(parent, asce, edat_level);
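For completeness, a hedged userspace sketch of how a VMM might use the
now-unconditional capability (standard KVM ioctl API; error handling
trimmed). Per the checks above, the capability must be enabled before
any vCPU exists (-EBUSY afterwards) and is refused with -EINVAL when
CMMA is already in use or for ucontrol VMs:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	struct kvm_enable_cap cap;
	int kvm, vm;

	kvm = open("/dev/kvm", O_RDWR);
	if (kvm < 0)
		return 1;
	vm = ioctl(kvm, KVM_CREATE_VM, 0);
	if (vm < 0)
		return 1;

	/* After this series: 1 for any non-ucontrol VM, no hpage= needed */
	if (ioctl(vm, KVM_CHECK_EXTENSION, KVM_CAP_S390_HPAGE_1M) != 1)
		return 1;

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_S390_HPAGE_1M;
	/* Enable before creating vCPUs; disables pfmfi and skey interpretation */
	if (ioctl(vm, KVM_ENABLE_CAP, &cap) < 0) {
		perror("KVM_ENABLE_CAP");
		return 1;
	}
	/* ... back guest memory with 1M hugetlbfs pages, create vCPUs ... */
	return 0;
}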