From patchwork Thu Feb 22 23:16:43 2024
X-Patchwork-Submitter: Elliot Berman
X-Patchwork-Id: 13568242
From: Elliot Berman <quic_eberman@quicinc.com>
Date: Thu, 22 Feb 2024 15:16:43 -0800
Subject: [PATCH v17 20/35] virt: gunyah: Add interfaces to map memory into guest address space
Message-ID: <20240222-gunyah-v17-20-1e9da6763d38@quicinc.com>
References: <20240222-gunyah-v17-0-1e9da6763d38@quicinc.com>
In-Reply-To: <20240222-gunyah-v17-0-1e9da6763d38@quicinc.com>
To: Alex Elder, Srinivas Kandagatla, Murali Nalajal, Trilok Soni,
 Srivatsa Vaddagiri, Carl van Schaik, Philip Derrin,
 Prakruthi Deepak Heragu, Jonathan Corbet, Rob Herring,
 Krzysztof Kozlowski, Conor Dooley, Catalin Marinas, Will Deacon,
 Konrad Dybcio, Bjorn Andersson, Dmitry Baryshkov, Fuad Tabba,
 Sean Christopherson, Andrew Morton
CC: Elliot Berman
X-Mailer: b4 0.12.4
Gunyah virtual machines are created with either all memory provided at VM
creation, using the resource manager's memory parcel construct, or
incrementally by enabling VM demand paging. Gunyah's demand paging support
is provided directly by the hypervisor and does not require the creation of
resource manager memory parcels. Demand paging allows the host to map and
unmap contiguous runs of pages (folios) into a Gunyah memory extent object
with the correct rights, allowing the extent's pages to be mapped into the
guest VM's address space.

Memory extents are Gunyah's mechanism for handling system memory,
abstracting away the direct use of physical page numbers. Memory extents
are hypervisor objects and are therefore referenced and access-controlled
with capabilities. When a virtual machine is configured for demand paging,
three memory extent capabilities and one address space capability are
provided to the host. The resource-manager-defined policy is such that
memory in the "host-only" extent (the default) is private to the host.
Memory in the "guest-only" extent can be used for guest-private mappings
and is unmapped from the host. Memory in the "host-and-guest-shared"
extent can be mapped concurrently and shared between the host and guest
VMs.

Implement two functions which Linux can use to move memory between the
virtual machines: gunyah_vm_provide_folio and gunyah_vm_reclaim_folio.
Memory that has been provided to the guest is tracked in a maple tree so
that it can be reclaimed later. Folios provided to the virtual machine are
assumed to be owned by the Gunyah stack: the folio's ->private field is
used for bookkeeping about whether the page is mapped into the virtual
machine.

Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
---
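A usage sketch, not part of the patch: the calling pattern intended for the
two new functions. The wrapper gunyah_demo_lend_page() is hypothetical;
only gunyah_vm_provide_folio() and gunyah_vm_reclaim_folio() are introduced
below, and the folio locking shown mirrors the contract visible in this
patch (the -ENODEV path unlocks inside the callee).

static int gunyah_demo_lend_page(struct gunyah_vm *ghvm, struct folio *folio,
                                 u64 gfn)
{
        int ret;

        /*
         * Lend the folio privately (share=false) with write access: it is
         * unmapped from the kernel direct map, donated to the guest-private
         * extent, mapped at gunyah_gfn_to_gpa(gfn), and tracked in ghvm->mm.
         */
        folio_lock(folio);
        ret = gunyah_vm_provide_folio(ghvm, folio, gfn, false, true);
        if (ret) {
                /* on -ENODEV, gunyah_vm_provide_folio() already unlocked */
                if (ret != -ENODEV)
                        folio_unlock(folio);
                return ret;
        }
        folio_unlock(folio);

        /* ... guest runs and may access the page at gfn ... */

        /*
         * Reclaim: unmap from the guest, donate back to the host extent,
         * restore the direct map, and drop the tracking entry.
         */
        return gunyah_vm_reclaim_folio(ghvm, gfn, folio);
}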
 drivers/virt/gunyah/Makefile     |   2 +-
 drivers/virt/gunyah/vm_mgr.c     |  69 +++++++++
 drivers/virt/gunyah/vm_mgr.h     | 108 +++++++++++++
 drivers/virt/gunyah/vm_mgr_mem.c | 321 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 499 insertions(+), 1 deletion(-)

diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
index 3f82af8c5ce79..f3c9507224eeb 100644
--- a/drivers/virt/gunyah/Makefile
+++ b/drivers/virt/gunyah/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
 
-gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
+gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mem.o
 
 obj-$(CONFIG_GUNYAH) += gunyah.o gunyah_rsc_mgr.o gunyah_vcpu.o

diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
index a6e25901dcea3..64967a8b72885 100644
--- a/drivers/virt/gunyah/vm_mgr.c
+++ b/drivers/virt/gunyah/vm_mgr.c
@@ -17,6 +17,16 @@
 #include "rsc_mgr.h"
 #include "vm_mgr.h"
 
+#define GUNYAH_VM_ADDRSPACE_LABEL 0
+// "To" extent for memory private to guest
+#define GUNYAH_VM_MEM_EXTENT_GUEST_PRIVATE_LABEL 0
+// "From" extent for memory shared with guest
+#define GUNYAH_VM_MEM_EXTENT_HOST_SHARED_LABEL 1
+// "To" extent for memory shared with the guest
+#define GUNYAH_VM_MEM_EXTENT_GUEST_SHARED_LABEL 3
+// "From" extent for memory private to guest
+#define GUNYAH_VM_MEM_EXTENT_HOST_PRIVATE_LABEL 2
+
 static DEFINE_XARRAY(gunyah_vm_functions);
 
 static void gunyah_vm_put_function(struct gunyah_vm_function *fn)
@@ -175,6 +185,16 @@ void gunyah_vm_function_unregister(struct gunyah_vm_function *fn)
 }
 EXPORT_SYMBOL_GPL(gunyah_vm_function_unregister);
 
+static bool gunyah_vm_resource_ticket_populate_noop(
+        struct gunyah_vm_resource_ticket *ticket, struct gunyah_resource *ghrsc)
+{
+        return true;
+}
+static void gunyah_vm_resource_ticket_unpopulate_noop(
+        struct gunyah_vm_resource_ticket *ticket, struct gunyah_resource *ghrsc)
+{
+}
+
 int gunyah_vm_add_resource_ticket(struct gunyah_vm *ghvm,
                                   struct gunyah_vm_resource_ticket *ticket)
 {
@@ -349,6 +369,17 @@ static void gunyah_vm_stop(struct gunyah_vm *ghvm)
                    ghvm->vm_status != GUNYAH_RM_VM_STATUS_RUNNING);
 }
 
+static inline void setup_extent_ticket(struct gunyah_vm *ghvm,
+                                       struct gunyah_vm_resource_ticket *ticket,
+                                       u32 label)
+{
+        ticket->resource_type = GUNYAH_RESOURCE_TYPE_MEM_EXTENT;
+        ticket->label = label;
+        ticket->populate = gunyah_vm_resource_ticket_populate_noop;
+        ticket->unpopulate = gunyah_vm_resource_ticket_unpopulate_noop;
+        gunyah_vm_add_resource_ticket(ghvm, ticket);
+}
+
 static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
 {
         struct gunyah_vm *ghvm;
@@ -372,6 +403,25 @@ static __must_check struct gunyah_vm *gunyah_vm_alloc(struct gunyah_rm *rm)
         INIT_LIST_HEAD(&ghvm->resources);
         INIT_LIST_HEAD(&ghvm->resource_tickets);
 
+        mt_init(&ghvm->mm);
+
+        ghvm->addrspace_ticket.resource_type = GUNYAH_RESOURCE_TYPE_ADDR_SPACE;
+        ghvm->addrspace_ticket.label = GUNYAH_VM_ADDRSPACE_LABEL;
+        ghvm->addrspace_ticket.populate =
+                gunyah_vm_resource_ticket_populate_noop;
+        ghvm->addrspace_ticket.unpopulate =
+                gunyah_vm_resource_ticket_unpopulate_noop;
+        gunyah_vm_add_resource_ticket(ghvm, &ghvm->addrspace_ticket);
+
+        setup_extent_ticket(ghvm, &ghvm->host_private_extent_ticket,
+                            GUNYAH_VM_MEM_EXTENT_HOST_PRIVATE_LABEL);
+        setup_extent_ticket(ghvm, &ghvm->host_shared_extent_ticket,
+                            GUNYAH_VM_MEM_EXTENT_HOST_SHARED_LABEL);
+        setup_extent_ticket(ghvm, &ghvm->guest_private_extent_ticket,
+                            GUNYAH_VM_MEM_EXTENT_GUEST_PRIVATE_LABEL);
+        setup_extent_ticket(ghvm, &ghvm->guest_shared_extent_ticket,
+                            GUNYAH_VM_MEM_EXTENT_GUEST_SHARED_LABEL);
+
         return ghvm;
 }
 
@@ -533,6 +583,23 @@ static void _gunyah_vm_put(struct kref *kref)
                 gunyah_vm_stop(ghvm);
 
         gunyah_vm_remove_functions(ghvm);
+
+        /*
+         * If this fails, we're going to lose the memory for good, which is
+         * BUG_ON-worthy but not unrecoverable (we just lose the memory).
+         * This call should always succeed though, because the VM is not
+         * running and RM will let us reclaim all the memory.
+         */
+        WARN_ON(gunyah_vm_reclaim_range(ghvm, 0, U64_MAX));
+
+        /* clang-format off */
+        gunyah_vm_remove_resource_ticket(ghvm, &ghvm->addrspace_ticket);
+        gunyah_vm_remove_resource_ticket(ghvm, &ghvm->host_shared_extent_ticket);
+        gunyah_vm_remove_resource_ticket(ghvm, &ghvm->host_private_extent_ticket);
+        gunyah_vm_remove_resource_ticket(ghvm, &ghvm->guest_shared_extent_ticket);
+        gunyah_vm_remove_resource_ticket(ghvm, &ghvm->guest_private_extent_ticket);
+        /* clang-format on */
+
         gunyah_vm_clean_resources(ghvm);
 
         if (ghvm->vm_status == GUNYAH_RM_VM_STATUS_EXITED ||
@@ -548,6 +615,8 @@ static void _gunyah_vm_put(struct kref *kref)
                 /* clang-format on */
         }
 
+        mtree_destroy(&ghvm->mm);
+
         if (ghvm->vm_status > GUNYAH_RM_VM_STATUS_NO_STATE) {
                 gunyah_rm_notifier_unregister(ghvm->rm,
                                               &ghvm->nb);

diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
index 8c5b94101b2cf..a6a4efa4138b7 100644
--- a/drivers/virt/gunyah/vm_mgr.h
+++ b/drivers/virt/gunyah/vm_mgr.h
@@ -8,20 +8,53 @@
 
 #include
 #include
+#include
 #include
+#include
 #include
+#include
 #include
 #include
 
 #include "rsc_mgr.h"
 
+static inline u64 gunyah_gpa_to_gfn(u64 gpa)
+{
+        return gpa >> PAGE_SHIFT;
+}
+
+static inline u64 gunyah_gfn_to_gpa(u64 gfn)
+{
+        return gfn << PAGE_SHIFT;
+}
+
 long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
                              unsigned long arg);
 
 /**
  * struct gunyah_vm - Main representation of a Gunyah Virtual machine
  * @vmid: Gunyah's VMID for this virtual machine
+ * @mm: A maple tree of all memory that has been mapped to a VM.
+ *      Indices are guest frame numbers; entries are either folios or
+ *      RM mem parcels
+ * @addrspace_ticket: Resource ticket to the capability for the guest VM's
+ *                    address space
+ * @host_private_extent_ticket: Resource ticket to the capability for our
+ *                              memory extent from which to lend private
+ *                              memory to the guest
+ * @host_shared_extent_ticket: Resource ticket to the capability for our
+ *                             memory extent from which to share memory
+ *                             with the guest. The distinction from
+ *                             @host_private_extent_ticket is needed on
+ *                             current Qualcomm platforms; on non-Qualcomm
+ *                             platforms, this is the same capability ID
+ * @guest_private_extent_ticket: Resource ticket to the capability for
+ *                               the guest's memory extent into which
+ *                               private memory is lent
+ * @guest_shared_extent_ticket: Resource ticket to the capability for
+ *                              the memory extent that represents
+ *                              memory shared with the guest.
 * @rm: Pointer to the resource manager struct to make RM calls
 * @parent: For logging
 * @nb: Notifier block for RM notifications
@@ -43,6 +76,11 @@ long gunyah_dev_vm_mgr_ioctl(struct gunyah_rm *rm, unsigned int cmd,
  */
 struct gunyah_vm {
         u16 vmid;
+        struct maple_tree mm;
+        struct gunyah_vm_resource_ticket addrspace_ticket,
+                host_private_extent_ticket, host_shared_extent_ticket,
+                guest_private_extent_ticket, guest_shared_extent_ticket;
+
         struct gunyah_rm *rm;
 
         struct notifier_block nb;
@@ -63,4 +101,74 @@ struct gunyah_vm {
 };
 
+/**
+ * folio_mmapped() - Returns true if the folio is mapped into any vma
+ * @folio: Folio to test
+ */
+static inline bool folio_mmapped(struct folio *folio)
+{
+        struct address_space *mapping = folio->mapping;
+        struct vm_area_struct *vma;
+        bool ret = false;
+
+        i_mmap_lock_read(mapping);
+        vma_interval_tree_foreach(vma, &mapping->i_mmap, folio_index(folio),
+                                  folio_index(folio) + folio_nr_pages(folio)) {
+                ret = true;
+                break;
+        }
+        i_mmap_unlock_read(mapping);
+        return ret;
+}
+
+/**
+ * gunyah_folio_lend_safe() - Returns true if the folio is safe to lend to the guest
+ * @folio: Folio to prepare
+ *
+ * Tests whether the folio is mapped anywhere outside the kernel logical map
+ * and whether any userspace has a vma containing the folio, even if it hasn't
+ * paged it in. We want to avoid causing a fault in userspace.
+ * If userspace doesn't have it mapped anywhere, unmap it from the kernel
+ * logical map to prevent accidental access (e.g. by load_unaligned_zeropad())
+ */
+static inline bool gunyah_folio_lend_safe(struct folio *folio)
+{
+        long i;
+
+        if (folio_mapped(folio) || folio_mmapped(folio))
+                return false;
+
+        for (i = 0; i < folio_nr_pages(folio); i++)
+                set_direct_map_invalid_noflush(folio_page(folio, i));
+        /*
+         * No need to flush tlb on armv8/9: hypervisor will flush when it
+         * removes from our stage 2
+         */
+        return true;
+}
+
+/**
+ * gunyah_folio_host_reclaim() - Restores the kernel logical map for a folio
+ * @folio: folio being reclaimed by the host
+ *
+ * See also gunyah_folio_lend_safe().
+ */
+static inline void gunyah_folio_host_reclaim(struct folio *folio)
+{
+        long i;
+
+        for (i = 0; i < folio_nr_pages(folio); i++)
+                set_direct_map_default_noflush(folio_page(folio, i));
+}
+
+int gunyah_vm_parcel_to_paged(struct gunyah_vm *ghvm,
+                              struct gunyah_rm_mem_parcel *parcel, u64 gfn,
+                              u64 nr);
+void gunyah_vm_mm_erase_range(struct gunyah_vm *ghvm, u64 gfn, u64 nr);
+int gunyah_vm_reclaim_parcel(struct gunyah_vm *ghvm,
+                             struct gunyah_rm_mem_parcel *parcel, u64 gfn);
+int gunyah_vm_provide_folio(struct gunyah_vm *ghvm, struct folio *folio,
+                            u64 gfn, bool share, bool write);
+int gunyah_vm_reclaim_folio(struct gunyah_vm *ghvm, u64 gfn, struct folio *folio);
+int gunyah_vm_reclaim_range(struct gunyah_vm *ghvm, u64 gfn, u64 nr);
+
 #endif

diff --git a/drivers/virt/gunyah/vm_mgr_mem.c b/drivers/virt/gunyah/vm_mgr_mem.c
new file mode 100644
index 0000000000000..bcc84473004be
--- /dev/null
+++ b/drivers/virt/gunyah/vm_mgr_mem.c
@@ -0,0 +1,321 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023-2024 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "gunyah_vm_mgr: " fmt
+
+#include
+#include
+#include
+
+#include "vm_mgr.h"
+
+#define WRITE_TAG (1 << 0)
+#define SHARE_TAG (1 << 1)
+
+static inline struct gunyah_resource *
+__first_resource(struct gunyah_vm_resource_ticket *ticket)
+{
+        return list_first_entry_or_null(&ticket->resources,
+                                        struct gunyah_resource, list);
+}
+
+int gunyah_vm_parcel_to_paged(struct gunyah_vm *ghvm,
+                              struct gunyah_rm_mem_parcel *parcel, u64 gfn,
+                              u64 nr)
+{
+        struct gunyah_rm_mem_entry *entry;
+        unsigned long i, tag = 0;
+        struct folio *folio;
+        pgoff_t off = 0;
+        int ret;
+
+        if (parcel->n_acl_entries > 1)
+                tag |= SHARE_TAG;
+        if (parcel->acl_entries[0].perms & GUNYAH_RM_ACL_W)
+                tag |= WRITE_TAG;
+
+        for (i = 0; i < parcel->n_mem_entries; i++) {
+                entry = &parcel->mem_entries[i];
+
+                folio = pfn_folio(PHYS_PFN(le64_to_cpu(entry->phys_addr)));
+                ret = mtree_insert_range(&ghvm->mm, gfn + off,
+                                         gfn + off + folio_nr_pages(folio) - 1,
+                                         xa_tag_pointer(folio, tag),
+                                         GFP_KERNEL);
+                if (ret) {
+                        WARN_ON(ret != -ENOMEM);
+                        gunyah_vm_mm_erase_range(ghvm, gfn, off);
+                        return ret;
+                }
+                off += folio_nr_pages(folio);
+        }
+
+        return 0;
+}
+
+/**
+ * gunyah_vm_mm_erase_range() - Erases a range of folios from ghvm's mm
+ * @ghvm: gunyah vm
+ * @gfn: start guest frame number
+ * @nr: number of pages to erase
+ *
+ * Do not use this function unless rolling back gunyah_vm_parcel_to_paged.
+ */
+void gunyah_vm_mm_erase_range(struct gunyah_vm *ghvm, u64 gfn, u64 nr)
+{
+        struct folio *folio;
+        u64 off = gfn;
+
+        while (off < gfn + nr) {
+                folio = xa_untag_pointer(mtree_erase(&ghvm->mm, off));
+                if (!folio)
+                        return;
+                off += folio_nr_pages(folio);
+        }
+}
+
+static inline u32 donate_flags(bool share)
+{
+        if (share)
+                return FIELD_PREP_CONST(GUNYAH_MEMEXTENT_OPTION_TYPE_MASK,
+                                        GUNYAH_MEMEXTENT_DONATE_TO_SIBLING);
+        else
+                return FIELD_PREP_CONST(GUNYAH_MEMEXTENT_OPTION_TYPE_MASK,
+                                        GUNYAH_MEMEXTENT_DONATE_TO_PROTECTED);
+}
+
+static inline u32 reclaim_flags(bool share)
+{
+        if (share)
+                return FIELD_PREP_CONST(GUNYAH_MEMEXTENT_OPTION_TYPE_MASK,
+                                        GUNYAH_MEMEXTENT_DONATE_TO_SIBLING);
+        else
+                return FIELD_PREP_CONST(GUNYAH_MEMEXTENT_OPTION_TYPE_MASK,
+                                        GUNYAH_MEMEXTENT_DONATE_FROM_PROTECTED);
+}
+
+int gunyah_vm_provide_folio(struct gunyah_vm *ghvm, struct folio *folio,
+                            u64 gfn, bool share, bool write)
+{
+        struct gunyah_resource *guest_extent, *host_extent, *addrspace;
+        u32 map_flags = BIT(GUNYAH_ADDRSPACE_MAP_FLAG_PARTIAL);
+        u64 extent_attrs, gpa = gunyah_gfn_to_gpa(gfn);
+        phys_addr_t pa = PFN_PHYS(folio_pfn(folio));
+        enum gunyah_pagetable_access access;
+        size_t size = folio_size(folio);
+        enum gunyah_error gunyah_error;
+        unsigned long tag = 0;
+        int ret;
+
+        /* clang-format off */
+        if (share) {
+                guest_extent = __first_resource(&ghvm->guest_shared_extent_ticket);
+                host_extent = __first_resource(&ghvm->host_shared_extent_ticket);
+        } else {
+                guest_extent = __first_resource(&ghvm->guest_private_extent_ticket);
+                host_extent = __first_resource(&ghvm->host_private_extent_ticket);
+        }
+        /* clang-format on */
+        addrspace = __first_resource(&ghvm->addrspace_ticket);
+
+        if (!addrspace || !guest_extent || !host_extent) {
+                folio_unlock(folio);
+                return -ENODEV;
+        }
+
+        if (share) {
+                map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_VMMIO);
+                tag |= SHARE_TAG;
+        } else {
+                map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_PRIVATE);
+        }
+
+        if (write)
+                tag |= WRITE_TAG;
+
+        ret = mtree_insert_range(&ghvm->mm, gfn,
+                                 gfn + folio_nr_pages(folio) - 1,
+                                 xa_tag_pointer(folio, tag), GFP_KERNEL);
+        if (ret == -EEXIST)
+                ret = -EAGAIN;
+        if (ret)
+                return ret;
+
+        /* don't lend a folio that is (or could be) mapped by Linux */
+        if (!share && !gunyah_folio_lend_safe(folio)) {
+                ret = -EPERM;
+                goto remove;
+        }
+
+        if (share && write)
+                access = GUNYAH_PAGETABLE_ACCESS_RW;
+        else if (share && !write)
+                access = GUNYAH_PAGETABLE_ACCESS_R;
+        else if (!share && write)
+                access = GUNYAH_PAGETABLE_ACCESS_RWX;
+        else /* !share && !write */
+                access = GUNYAH_PAGETABLE_ACCESS_RX;
+
+        gunyah_error = gunyah_hypercall_memextent_donate(donate_flags(share),
+                                                         host_extent->capid,
+                                                         guest_extent->capid,
+                                                         pa, size);
+        if (gunyah_error != GUNYAH_ERROR_OK) {
+                pr_err("Failed to donate memory for guest address 0x%016llx: %d\n",
+                       gpa, gunyah_error);
+                ret = gunyah_error_remap(gunyah_error);
+                goto remove;
+        }
+
+        extent_attrs =
+                FIELD_PREP_CONST(GUNYAH_MEMEXTENT_MAPPING_TYPE,
+                                 ARCH_GUNYAH_DEFAULT_MEMTYPE) |
+                FIELD_PREP(GUNYAH_MEMEXTENT_MAPPING_USER_ACCESS, access) |
+                FIELD_PREP(GUNYAH_MEMEXTENT_MAPPING_KERNEL_ACCESS, access);
+        gunyah_error = gunyah_hypercall_addrspace_map(addrspace->capid,
+                                                      guest_extent->capid, gpa,
+                                                      extent_attrs, map_flags,
+                                                      pa, size);
+        if (gunyah_error != GUNYAH_ERROR_OK) {
+                pr_err("Failed to map guest address 0x%016llx: %d\n", gpa,
+                       gunyah_error);
+                ret = gunyah_error_remap(gunyah_error);
+                goto memextent_reclaim;
+        }
+
+        folio_get(folio);
+        if (!share)
+                folio_set_private(folio);
+        return 0;
+memextent_reclaim:
+        gunyah_error = gunyah_hypercall_memextent_donate(reclaim_flags(share),
+                                                         guest_extent->capid,
+                                                         host_extent->capid, pa,
+                                                         size);
+        if (gunyah_error != GUNYAH_ERROR_OK)
+                pr_err("Failed to reclaim memory donation for guest address 0x%016llx: %d\n",
+                       gpa, gunyah_error);
+remove:
+        mtree_erase(&ghvm->mm, gfn);
+        return ret;
+}
+
+static int __gunyah_vm_reclaim_folio_locked(struct gunyah_vm *ghvm, void *entry,
+                                            u64 gfn, const bool sync)
+{
+        u32 map_flags = BIT(GUNYAH_ADDRSPACE_MAP_FLAG_PARTIAL);
+        struct gunyah_resource *guest_extent, *host_extent, *addrspace;
+        enum gunyah_error gunyah_error;
+        struct folio *folio;
+        bool write, share;
+        phys_addr_t pa;
+        size_t size;
+        int ret;
+
+        addrspace = __first_resource(&ghvm->addrspace_ticket);
+        if (!addrspace)
+                return -ENODEV;
+
+        share = !!(xa_pointer_tag(entry) & SHARE_TAG);
+        write = !!(xa_pointer_tag(entry) & WRITE_TAG);
+        folio = xa_untag_pointer(entry);
+
+        if (!sync)
+                map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_NOSYNC);
+
+        /* clang-format off */
+        if (share) {
+                guest_extent = __first_resource(&ghvm->guest_shared_extent_ticket);
+                host_extent = __first_resource(&ghvm->host_shared_extent_ticket);
+                map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_VMMIO);
+        } else {
+                guest_extent = __first_resource(&ghvm->guest_private_extent_ticket);
+                host_extent = __first_resource(&ghvm->host_private_extent_ticket);
+                map_flags |= BIT(GUNYAH_ADDRSPACE_MAP_FLAG_PRIVATE);
+        }
+        /* clang-format on */
+
+        pa = PFN_PHYS(folio_pfn(folio));
+        size = folio_size(folio);
+
+        gunyah_error = gunyah_hypercall_addrspace_unmap(addrspace->capid,
+                                                        guest_extent->capid,
+                                                        gunyah_gfn_to_gpa(gfn),
+                                                        map_flags, pa, size);
+        if (gunyah_error != GUNYAH_ERROR_OK) {
+                pr_err_ratelimited(
+                        "Failed to unmap guest address 0x%016llx: %d\n",
+                        gunyah_gfn_to_gpa(gfn), gunyah_error);
+                ret = gunyah_error_remap(gunyah_error);
+                goto err;
+        }
+
+        gunyah_error = gunyah_hypercall_memextent_donate(reclaim_flags(share),
+                                                         guest_extent->capid,
+                                                         host_extent->capid, pa,
+                                                         size);
+        if (gunyah_error != GUNYAH_ERROR_OK) {
+                pr_err_ratelimited(
+                        "Failed to reclaim memory donation for guest address 0x%016llx: %d\n",
+                        gunyah_gfn_to_gpa(gfn), gunyah_error);
+                ret = gunyah_error_remap(gunyah_error);
+                goto err;
+        }
+
+        BUG_ON(mtree_erase(&ghvm->mm, gfn) != entry);
+
+        if (folio_test_private(folio)) {
+                gunyah_folio_host_reclaim(folio);
+                folio_clear_private(folio);
+        }
+
+        folio_put(folio);
+        return 0;
+err:
+        return ret;
+}
+
+int gunyah_vm_reclaim_folio(struct gunyah_vm *ghvm, u64 gfn, struct folio *folio)
+{
+        void *entry;
+
+        entry = mtree_load(&ghvm->mm, gfn);
+        if (!entry)
+                return 0;
+
+        if (folio != xa_untag_pointer(entry))
+                return -EAGAIN;
+
+        return __gunyah_vm_reclaim_folio_locked(ghvm, entry, gfn, true);
+}
+
+int gunyah_vm_reclaim_range(struct gunyah_vm *ghvm, u64 gfn, u64 nr)
+{
+        unsigned long next = gfn, g;
+        struct folio *folio;
+        int ret, ret2 = 0;
+        void *entry;
+        bool sync;
+
+        mt_for_each(&ghvm->mm, entry, next, gfn + nr) {
+                folio = xa_untag_pointer(entry);
+                g = next;
+                sync = !!mt_find_after(&ghvm->mm, &g, gfn + nr);
+
+                g = next - folio_nr_pages(folio);
+                folio_get(folio);
+                folio_lock(folio);
+                if (mtree_load(&ghvm->mm, g) == entry)
+                        ret = __gunyah_vm_reclaim_folio_locked(ghvm, entry, g, sync);
+                else
+                        ret = -EAGAIN;
+                folio_unlock(folio);
+                folio_put(folio);
+                if (ret && ret2 != -EAGAIN)
+                        ret2 = ret;
+        }
+
+        return ret2;
+}
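
For reference, a sketch of the tagged-pointer bookkeeping used throughout
vm_mgr_mem.c: entries in the ghvm->mm maple tree are folio pointers with
SHARE_TAG/WRITE_TAG folded into their low bits via xa_tag_pointer(), so the
reclaim path can recover both the folio and how it was granted. The function
demo_tag_roundtrip() below is hypothetical and only illustrates the xarray
tag helpers; it is not part of the patch.

static void demo_tag_roundtrip(struct maple_tree *mt, struct folio *folio,
                               u64 gfn, bool share, bool write)
{
        unsigned long tag = (share ? SHARE_TAG : 0) | (write ? WRITE_TAG : 0);
        void *entry;

        /* store the folio covering [gfn, gfn + nr_pages) with its grant tags */
        if (WARN_ON(mtree_insert_range(mt, gfn,
                                       gfn + folio_nr_pages(folio) - 1,
                                       xa_tag_pointer(folio, tag),
                                       GFP_KERNEL)))
                return;

        /* any index inside the folio's range loads the same tagged entry */
        entry = mtree_load(mt, gfn);
        WARN_ON(xa_untag_pointer(entry) != folio);
        WARN_ON(!!(xa_pointer_tag(entry) & SHARE_TAG) != share);
        WARN_ON(!!(xa_pointer_tag(entry) & WRITE_TAG) != write);
}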