From patchwork Wed Oct 16 16:10:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anup Patel X-Patchwork-Id: 11193699 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 78ED3912 for ; Wed, 16 Oct 2019 16:10:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4567821D7A for ; Wed, 16 Oct 2019 16:10:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="OaIkFGoN"; dkim=pass (1024-bit key) header.d=sharedspace.onmicrosoft.com header.i=@sharedspace.onmicrosoft.com header.b="iInaTKUI" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2405321AbfJPQKc (ORCPT ); Wed, 16 Oct 2019 12:10:32 -0400 Received: from esa4.hgst.iphmx.com ([216.71.154.42]:20334 "EHLO esa4.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2405308AbfJPQKb (ORCPT ); Wed, 16 Oct 2019 12:10:31 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1571242230; x=1602778230; h=from:to:cc:subject:date:message-id:references: in-reply-to:content-transfer-encoding:mime-version; bh=3yBuwoxlI9+CUdz7Q9+//RTEtW88BfbTx++c7p2/8js=; b=OaIkFGoNqQ0Uc/EEjvCID9w5H4s607c8AFJN09TS5qoq8xQWDD8uVEa5 N78LvBYnaGFtnwLUbOeLIDAaE3J89FbsY8atc/6qVCk+Fp88kvj3iVzg2 qbWNoDmr3NjmQM6iR0VMgwBkA4P2itgNtdvd3vTESq9q+y5mpY26hC6E4 Cb5R3oCLY0FZr85jMerU27ESWP+cVZp+kL/CMyM7n8bHg42U7aJE2Cg0N hM+DbPyNJ5JzqHLMa/WtiDUhHoug6AfSpzRWxxx2yfVqpr4WwJcNvjTDL se66hOAg9G2yldUs5aGAzUeRxXhyrI1Up53FNYscPY2xJotNSVUj2YUlB Q==; IronPort-SDR: l2NBdZglVfFwWIST14d31fpicgRxEC6dnbFnz0Ht8dz8DQ7ynzfwyufMonD6XttYfO2nTEe/Sa BWlI+sVb8YrxnX7XWZP8fJtyv3s4HTFye703mqQfH7yM1ECCRJnp3I03vHEfbGqcUF78DlWWT3 pqaKtVG0cGl/mORq/NgVaA00yoJgYQp3Ydya1PqYyxHM+Y0VmvjIYEOYlGHg1bEgmYc4GxDLN0 mfxkPxKIe9t4V3MtumZY/0eBoLnG4xxRqxDhTdxUUSuEiazod5PvRVEbMZD2sS4D1cnWpJc7qD oFA= X-IronPort-AV: E=Sophos;i="5.67,304,1566835200"; d="scan'208";a="120681062" Received: from mail-dm3nam03lp2057.outbound.protection.outlook.com (HELO NAM03-DM3-obe.outbound.protection.outlook.com) ([104.47.41.57]) by ob1.hgst.iphmx.com with ESMTP; 17 Oct 2019 00:10:29 +0800 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=depVv3HDiuj4oMjE7GMFkmDhjDsxKFkOzFNxVChSDK2p0m7zxMaDc6peqPe716ShTz1Qr8GmqsmEGCE2N8MI5D4w1nrR+l6GHblaqZ+WnpVPCaUdkU++EKMeXBIzRF0wsChvuDoh6B4CyZcJ0Z+Qo0L8DhaE5reOcaYPP/9UZUwGxQW+h3RhD2QxMkTpaDT8u3oUcaAOJzG16AYbQTAXul+gX05VmdCYYhAzddzitSsKdttVDAzCMdKS9TvnT8/n+/fp0zljI7qKvhQQlhyakb4r8i5lElOPa8pkoNBHXlJHqV3FhY2LqX6yReq9JNT2pwh8D4d/HVnVjtQFb2pFaQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=GMYMNsbShEmF112tSjSjSh1D/PlrG4v3cRE7rt+JVIY=; b=lc0JuQieKN9y9oixGIySMLfdId06k2Dz51IWOFRNo31c4x/oZ0rwrFS8bIx5gTILUTcSrm6v6uCIlCoawlXIOkd+AL1pci2dkehk3ZiJS7oaeIjKGJqRia66ke6G4Q9T+Xcp6dYLIAvHiJz9JFBxRrHCMyVeM5M2Fd28NbFTns3Mf66qpwVoyxRysQAaadcgFTCiMxOfzH2Efu9XvIXyfte9te9ljDFmD0st5AynLLqOiJ7IEri56RIVb+zwLwMEDi1USkidbLV2rq8K8/C8S2kDCQ2ZyZNrm4qgNP58x/Ux706vcIE9CJW9jv7aroM9dXKvUAejJqnRxcqOQwfjfw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=wdc.com; dmarc=pass action=none header.from=wdc.com; dkim=pass header.d=wdc.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sharedspace.onmicrosoft.com; s=selector2-sharedspace-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=GMYMNsbShEmF112tSjSjSh1D/PlrG4v3cRE7rt+JVIY=; b=iInaTKUIK/pJd0khvM6Zs8HWWNlerKjak0NiHFvqgbyRinFmCdAX1u3i31wIRygk24Flkg4O/BVrYsb1EGneXgIMmDQFHr8nhO882T/T2uiuI58UahEAJLEuk6jYYEQg1L52zfZWbzI+/1PFAzfUe3QaXPQfej/9WxpDFfuxkG8= Received: from MN2PR04MB6061.namprd04.prod.outlook.com (20.178.246.15) by MN2PR04MB6397.namprd04.prod.outlook.com (52.132.170.135) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2347.17; Wed, 16 Oct 2019 16:10:27 +0000 Received: from MN2PR04MB6061.namprd04.prod.outlook.com ([fe80::1454:87a:13b0:d3a]) by MN2PR04MB6061.namprd04.prod.outlook.com ([fe80::1454:87a:13b0:d3a%7]) with mapi id 15.20.2347.023; Wed, 16 Oct 2019 16:10:27 +0000 From: Anup Patel To: Palmer Dabbelt , Paul Walmsley , Paolo Bonzini , Radim K CC: Daniel Lezcano , Thomas Gleixner , Alexander Graf , Atish Patra , Alistair Francis , Damien Le Moal , Christoph Hellwig , Anup Patel , "kvm@vger.kernel.org" , "linux-riscv@lists.infradead.org" , "linux-kernel@vger.kernel.org" , Anup Patel Subject: [PATCH v9 12/22] RISC-V: KVM: Implement MMU notifiers Thread-Topic: [PATCH v9 12/22] RISC-V: KVM: Implement MMU notifiers Thread-Index: AQHVhDw5KalS/jP3s0OPelwGiCdnSw== Date: Wed, 16 Oct 2019 16:10:27 +0000 Message-ID: <20191016160649.24622-13-anup.patel@wdc.com> References: <20191016160649.24622-1-anup.patel@wdc.com> In-Reply-To: <20191016160649.24622-1-anup.patel@wdc.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-clientproxiedby: MAXPR01CA0098.INDPRD01.PROD.OUTLOOK.COM (2603:1096:a00:5d::16) To MN2PR04MB6061.namprd04.prod.outlook.com (2603:10b6:208:d8::15) authentication-results: spf=none (sender IP is ) smtp.mailfrom=Anup.Patel@wdc.com; x-ms-exchange-messagesentrepresentingtype: 1 x-mailer: git-send-email 2.17.1 x-originating-ip: [106.51.27.162] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 613c10ff-8ee0-48e4-8c2d-08d752535be9 x-ms-office365-filtering-ht: Tenant x-ms-traffictypediagnostic: MN2PR04MB6397: x-ms-exchange-transport-forked: True x-microsoft-antispam-prvs: wdcipoutbound: EOP-TRUE x-ms-oob-tlc-oobclassifiers: OLM:785; x-forefront-prvs: 0192E812EC x-forefront-antispam-report: SFV:NSPM;SFS:(10019020)(4636009)(366004)(346002)(396003)(376002)(39860400002)(136003)(199004)(189003)(71190400001)(71200400001)(66556008)(66476007)(66446008)(64756008)(66946007)(6116002)(3846002)(14444005)(66066001)(76176011)(52116002)(256004)(99286004)(25786009)(7416002)(14454004)(4326008)(478600001)(7736002)(8676002)(305945005)(110136005)(476003)(54906003)(86362001)(2616005)(11346002)(446003)(316002)(102836004)(81166006)(486006)(55236004)(44832011)(5660300002)(50226002)(26005)(9456002)(186003)(8936002)(36756003)(1076003)(6436002)(6512007)(2906002)(6486002)(386003)(81156014)(6506007);DIR:OUT;SFP:1102;SCL:1;SRVR:MN2PR04MB6397;H:MN2PR04MB6061.namprd04.prod.outlook.com;FPR:;SPF:None;LANG:en;PTR:InfoNoRecords;A:1;MX:1; x-ms-exchange-senderadcheck: 1 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: CpEC4NeV1HdsGjtLucOH+ugTMkGV/O0HslEcVNRxIWPmkgUeagh3ASKSRZOZz6++SpHFSLTrixWn5uDa6o8Em9Qnp5Z3fvGaLO91MSFmLVBGUL2/3frgGNXGU1tG90P/1u1MGQQJjI0ffkcXgjeVgtGKY+5E2aPm9pQNe141WjHhbl/aR9zprXHB+UhL3l6jW6Nc5+7j9r0VqTsIJITzB3xX7t4NNXWze6HuasXIlh8Xz9PzvNSP3vXTaVH0XpIBKgcf9wwEi6J7KEdad8aiymYAlExHhrdgCXzalU97nFf9Wc4cfS3Uxbv7A9B+WFNFiULR1xCvG2Btm/pfPAJZXHG5DIdTgfrccbhpQwKV4g5dvQU0yZChvPfj+PcvUpuJ3JHHhraajfXsqvdOrcESKMCkyVDPw7d5DiCED3pSJMw= MIME-Version: 1.0 X-OriginatorOrg: wdc.com X-MS-Exchange-CrossTenant-Network-Message-Id: 613c10ff-8ee0-48e4-8c2d-08d752535be9 X-MS-Exchange-CrossTenant-originalarrivaltime: 16 Oct 2019 16:10:27.5939 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: b61c8803-16f3-4c35-9b17-6f65f441df86 X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: Bp/teajZ3Z1l6NvMlfCHX6+yN03CG89GHItX3XcHxwBL36hOtIbhu472Eb+0BrBffh69FVRwuOzeXhk9oFDL+A== X-MS-Exchange-Transport-CrossTenantHeadersStamped: MN2PR04MB6397 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This patch implements MMU notifiers for KVM RISC-V so that Guest physical address space is in-sync with Host physical address space. This will allow swapping, page migration, etc to work transparently with KVM RISC-V. Signed-off-by: Anup Patel Acked-by: Paolo Bonzini Reviewed-by: Paolo Bonzini Reviewed-by: Alexander Graf --- arch/riscv/include/asm/kvm_host.h | 7 ++ arch/riscv/kvm/Kconfig | 1 + arch/riscv/kvm/mmu.c | 200 +++++++++++++++++++++++++++++- arch/riscv/kvm/vm.c | 1 + 4 files changed, 208 insertions(+), 1 deletion(-) diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h index aba3d80d4bea..b1493c6cbe83 100644 --- a/arch/riscv/include/asm/kvm_host.h +++ b/arch/riscv/include/asm/kvm_host.h @@ -193,6 +193,13 @@ static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {} int kvm_riscv_setup_vsip(void); void kvm_riscv_cleanup_vsip(void); +#define KVM_ARCH_WANT_MMU_NOTIFIER +int kvm_unmap_hva_range(struct kvm *kvm, + unsigned long start, unsigned long end); +int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte); +int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end); +int kvm_test_age_hva(struct kvm *kvm, unsigned long hva); + void __kvm_riscv_hfence_gvma_vmid_gpa(unsigned long vmid, unsigned long gpa); void __kvm_riscv_hfence_gvma_vmid(unsigned long vmid); diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig index 9cca98c4673b..d8fa13b0da18 100644 --- a/arch/riscv/kvm/Kconfig +++ b/arch/riscv/kvm/Kconfig @@ -20,6 +20,7 @@ if VIRTUALIZATION config KVM tristate "Kernel-based Virtual Machine (KVM) support (EXPERIMENTAL)" depends on OF + select MMU_NOTIFIER select PREEMPT_NOTIFIERS select ANON_INODES select KVM_MMIO diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index 61a98d63d103..5aa5ea5ef8f6 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -67,6 +67,66 @@ static void *stage2_cache_alloc(struct kvm_mmu_page_cache *pcache) return p; } +static int stage2_pgdp_test_and_clear_young(pgd_t *pgd) +{ + return ptep_test_and_clear_young(NULL, 0, (pte_t *)pgd); +} + +static int stage2_pmdp_test_and_clear_young(pmd_t *pmd) +{ + return ptep_test_and_clear_young(NULL, 0, (pte_t *)pmd); +} + +static int stage2_ptep_test_and_clear_young(pte_t *pte) +{ + return ptep_test_and_clear_young(NULL, 0, pte); +} + +static bool stage2_get_leaf_entry(struct kvm *kvm, gpa_t addr, + pgd_t **pgdpp, pmd_t **pmdpp, pte_t **ptepp) +{ + pgd_t *pgdp; + pmd_t *pmdp; + pte_t *ptep; + + *pgdpp = NULL; + *pmdpp = NULL; + *ptepp = NULL; + + pgdp = &kvm->arch.pgd[pgd_index(addr)]; + if (!pgd_val(*pgdp)) + return false; + if (pgd_val(*pgdp) & _PAGE_LEAF) { + *pgdpp = pgdp; + return true; + } + + if (stage2_have_pmd) { + pmdp = (void *)pgd_page_vaddr(*pgdp); + pmdp = &pmdp[pmd_index(addr)]; + if (!pmd_present(*pmdp)) + return false; + if (pmd_val(*pmdp) & _PAGE_LEAF) { + *pmdpp = pmdp; + return true; + } + + ptep = (void *)pmd_page_vaddr(*pmdp); + } else { + ptep = (void *)pgd_page_vaddr(*pgdp); + } + + ptep = &ptep[pte_index(addr)]; + if (!pte_present(*ptep)) + return false; + if (pte_val(*ptep) & _PAGE_LEAF) { + *ptepp = ptep; + return true; + } + + return false; +} + struct local_guest_tlb_info { struct kvm_vmid *vmid; gpa_t addr; @@ -450,6 +510,38 @@ int stage2_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa, } +static int handle_hva_to_gpa(struct kvm *kvm, + unsigned long start, + unsigned long end, + int (*handler)(struct kvm *kvm, + gpa_t gpa, u64 size, + void *data), + void *data) +{ + struct kvm_memslots *slots; + struct kvm_memory_slot *memslot; + int ret = 0; + + slots = kvm_memslots(kvm); + + /* we only care about the pages that the guest sees */ + kvm_for_each_memslot(memslot, slots) { + unsigned long hva_start, hva_end; + gfn_t gpa; + + hva_start = max(start, memslot->userspace_addr); + hva_end = min(end, memslot->userspace_addr + + (memslot->npages << PAGE_SHIFT)); + if (hva_start >= hva_end) + continue; + + gpa = hva_to_gfn_memslot(hva_start, memslot) << PAGE_SHIFT; + ret |= handler(kvm, gpa, (u64)(hva_end - hva_start), data); + } + + return ret; +} + void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, struct kvm_memory_slot *dont) { @@ -582,6 +674,106 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, return ret; } +static int kvm_unmap_hva_handler(struct kvm *kvm, + gpa_t gpa, u64 size, void *data) +{ + stage2_unmap_range(kvm, gpa, size); + return 0; +} + +int kvm_unmap_hva_range(struct kvm *kvm, + unsigned long start, unsigned long end) +{ + if (!kvm->arch.pgd) + return 0; + + handle_hva_to_gpa(kvm, start, end, + &kvm_unmap_hva_handler, NULL); + return 0; +} + +static int kvm_set_spte_handler(struct kvm *kvm, + gpa_t gpa, u64 size, void *data) +{ + pte_t *pte = (pte_t *)data; + + WARN_ON(size != PAGE_SIZE); + stage2_set_pte(kvm, NULL, gpa, pte); + + return 0; +} + +int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte) +{ + unsigned long end = hva + PAGE_SIZE; + kvm_pfn_t pfn = pte_pfn(pte); + pte_t stage2_pte; + + if (!kvm->arch.pgd) + return 0; + + stage2_pte = pfn_pte(pfn, PAGE_WRITE_EXEC); + handle_hva_to_gpa(kvm, hva, end, + &kvm_set_spte_handler, &stage2_pte); + + return 0; +} + +static int kvm_age_hva_handler(struct kvm *kvm, + gpa_t gpa, u64 size, void *data) +{ + pgd_t *pgd; + pmd_t *pmd; + pte_t *pte; + + WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PGDIR_SIZE); + if (!stage2_get_leaf_entry(kvm, gpa, &pgd, &pmd, &pte)) + return 0; + + if (pgd) + return stage2_pgdp_test_and_clear_young(pgd); + else if (pmd) + return stage2_pmdp_test_and_clear_young(pmd); + else + return stage2_ptep_test_and_clear_young(pte); +} + +int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end) +{ + if (!kvm->arch.pgd) + return 0; + + return handle_hva_to_gpa(kvm, start, end, kvm_age_hva_handler, NULL); +} + +static int kvm_test_age_hva_handler(struct kvm *kvm, + gpa_t gpa, u64 size, void *data) +{ + pgd_t *pgd; + pmd_t *pmd; + pte_t *pte; + + WARN_ON(size != PAGE_SIZE && size != PMD_SIZE); + if (!stage2_get_leaf_entry(kvm, gpa, &pgd, &pmd, &pte)) + return 0; + + if (pgd) + return pte_young(*((pte_t *)pgd)); + else if (pmd) + return pte_young(*((pte_t *)pmd)); + else + return pte_young(*pte); +} + +int kvm_test_age_hva(struct kvm *kvm, unsigned long hva) +{ + if (!kvm->arch.pgd) + return 0; + + return handle_hva_to_gpa(kvm, hva, hva, + kvm_test_age_hva_handler, NULL); +} + int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned long hva, bool is_write) { @@ -593,7 +785,7 @@ int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned long hva, struct vm_area_struct *vma; struct kvm *kvm = vcpu->kvm; struct kvm_mmu_page_cache *pcache = &vcpu->arch.mmu_page_cache; - unsigned long vma_pagesize; + unsigned long vma_pagesize, mmu_seq; down_read(¤t->mm->mmap_sem); @@ -626,6 +818,8 @@ int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned long hva, return ret; } + mmu_seq = kvm->mmu_notifier_seq; + hfn = gfn_to_pfn_prot(kvm, gfn, is_write, &writeable); if (hfn == KVM_PFN_ERR_HWPOISON) { if (is_vm_hugetlb_page(vma)) @@ -644,6 +838,9 @@ int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned long hva, spin_lock(&kvm->mmu_lock); + if (mmu_notifier_retry(kvm, mmu_seq)) + goto out_unlock; + if (writeable) { kvm_set_pfn_dirty(hfn); ret = stage2_map_page(kvm, pcache, gpa, hfn << PAGE_SHIFT, @@ -656,6 +853,7 @@ int kvm_riscv_stage2_map(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned long hva, if (ret) kvm_err("Failed to map in stage2\n"); +out_unlock: spin_unlock(&kvm->mmu_lock); kvm_set_pfn_accessed(hfn); kvm_release_pfn_clean(hfn); diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c index c5aab5478c38..fd84b4d914dc 100644 --- a/arch/riscv/kvm/vm.c +++ b/arch/riscv/kvm/vm.c @@ -54,6 +54,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) switch (ext) { case KVM_CAP_DEVICE_CTRL: case KVM_CAP_USER_MEMORY: + case KVM_CAP_SYNC_MMU: case KVM_CAP_DESTROY_MEMORY_REGION_WORKS: case KVM_CAP_ONE_REG: case KVM_CAP_READONLY_MEM: