From patchwork Thu Apr 4 18:50:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618146 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6705F134436 for ; Thu, 4 Apr 2024 18:50:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256645; cv=none; b=nBFqzFZ06mOWMq/fXv0aX1hk+mXh0jR30KdRlYUtH3vBZXWhFCNTaNUdXr/R2SF70aPoTiNzQ4JVexI8QMmW1RZG4gHTEyzxvElqhQSkVnrLNNqbwacoRh3SzuuYG8n0JO4568L3fMt/WqWUIKR6dPDCwJ2Q1eKwPnoJk4cp1kg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256645; c=relaxed/simple; bh=xLRvqZBxE8nyyslPqQaHeAePgGUJ5/dbiqlvXpYDL9c=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=TQrVHDX7y7iAzpKqer729AJ3DODaUmgS8npqpXnbOyZzW00HqpsvJBhz7imrkwbjzhc2euQhxVeXAW3ejIfA2su30li37US4JRJGMAOcBubF7tB45jp736MC+ibhGXIOJo1IzpbZVcBjyIHpe+77S20f8cbPXw41euLg5iMu8vg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=CfnOe289; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="CfnOe289" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256642; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4MThrRX4woKvTXfjRRP0DuefjayuTL5Cbh2ShiuiO1E=; b=CfnOe289nTBPc1YZ4qGuB9OerjlXdfomrfr/A5bHjMTAOB8wkFu/jY6m++vV/ifxQXQpYg 58lQGNT/QomQ5QDK04mjCqsqUrN43gbTdnGmdaXUk6jb92qltW414lw84dqvFfvkFgKgzR vRbl+s25kkk36Fv7+o5XGEWHmKXW/Rs= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-12-ZHOW8xWvPXKhmDMMj-J8PA-1; Thu, 04 Apr 2024 14:50:35 -0400 X-MC-Unique: ZHOW8xWvPXKhmDMMj-J8PA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 034981C0F2E0; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id CBC861C060A4; Thu, 4 Apr 2024 18:50:34 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com, Matthew Wilcox Subject: [PATCH 01/11] mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory Date: Thu, 4 Apr 2024 14:50:23 -0400 Message-ID: <20240404185034.3184582-2-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 From: Michael Roth filemap users like guest_memfd may use page cache pages to allocate/manage memory that is only intended to be accessed by guests via hardware protections like encryption. Writes to memory of this sort in common paths like truncation may cause unexpected behavior such writing garbage instead of zeros when attempting to zero pages, or worse, triggering hardware protections that are considered fatal as far as the kernel is concerned. Introduce a new address_space flag, AS_INACCESSIBLE, and use this initially to prevent zero'ing of pages during truncation, with the understanding that it is up to the owner of the mapping to handle this specially if needed. Link: https://lore.kernel.org/lkml/ZR9LYhpxTaTk6PJX@google.com/ Cc: Matthew Wilcox Suggested-by: Sean Christopherson Signed-off-by: Michael Roth Message-ID: <20240329212444.395559-5-michael.roth@amd.com> Signed-off-by: Paolo Bonzini Acked-by: Vlastimil Babka --- include/linux/pagemap.h | 1 + mm/truncate.c | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 2df35e65557d..f879c1d54da7 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -207,6 +207,7 @@ enum mapping_flags { AS_STABLE_WRITES, /* must wait for writeback before modifying folio contents */ AS_UNMOVABLE, /* The mapping cannot be moved, ever */ + AS_INACCESSIBLE, /* Do not attempt direct R/W access to the mapping */ }; /** diff --git a/mm/truncate.c b/mm/truncate.c index 725b150e47ac..c501338c7ebd 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -233,7 +233,8 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end) * doing a complex calculation here, and then doing the zeroing * anyway if the page split fails. */ - folio_zero_range(folio, offset, length); + if (!(folio->mapping->flags & AS_INACCESSIBLE)) + folio_zero_range(folio, offset, length); if (folio_has_private(folio)) folio_invalidate(folio, offset, length); From patchwork Thu Apr 4 18:50:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618139 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5F62A131E24 for ; Thu, 4 Apr 2024 18:50:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256642; cv=none; b=l3z2+sjoTzYA9e9m0VtXcVRra+IccQVB87WGmTxJby/dFw4luf+nGDp69Lux1rqCKMiDOGyNwJ2essjBSONluUc+jqLT3x1t+JYa7fbSqbZU4PgpG+ENrauUUjLl2SP7+T1nX/1/HiY9sgmsjXnKrDAoeF4mLQW/zXuNz3gJPAE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256642; c=relaxed/simple; bh=rPyzJzYSSz48gXujx1qLV6V98R9AloP/Cjjmrkp6pHc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Ofis64c7wDNgDUIz+Xwxh2lRVsXHnOPMD73am0DHSQHcjq1hPhup/8eNCtCmG4EK2NgTxeeGwm/84POzO3AGmMokTLxR05+wWNMw2rZgxFlm+0HncEsGPAUTSvdSjDR7xIfz1PVNxAFrAM0pwfVtanBhuQxZaQmfDs8gLJs6EFQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=HobvqCh4; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="HobvqCh4" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256639; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZJEIZUQrzsd0Y9E8i/CEuCH1bzGOQtQsL0wQj5bpcwo=; b=HobvqCh4pxQkZl5ZLzgyrjRPq1f8+RAd63ai+NPFApJBBqLhnJ2f8+4Ty4RYAc09jIg3+i BZ8SfB9yRztLoVJvfyMLuGe1m909+ZR5cQacZBEgNfkGJJOwSnEMGMVQJ42O24XRlHEEab B9ogN9JishNoWaOiu7tjF1DNyLtkeXE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-625-opckuniFNKe245UFzdHtFA-1; Thu, 04 Apr 2024 14:50:35 -0400 X-MC-Unique: opckuniFNKe245UFzdHtFA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 31A9785CE42; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0BACF1C060A4; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com Subject: [PATCH 02/11] KVM: guest_memfd: Use AS_INACCESSIBLE when creating guest_memfd inode Date: Thu, 4 Apr 2024 14:50:24 -0400 Message-ID: <20240404185034.3184582-3-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 From: Michael Roth truncate_inode_pages_range() may attempt to zero pages before truncating them, and this will occur before arch-specific invalidations can be triggered via .invalidate_folio/.free_folio hooks via kvm_gmem_aops. For AMD SEV-SNP this would result in an RMP #PF being generated by the hardware, which is currently treated as fatal (and even if specifically allowed for, would not result in anything other than garbage being written to guest pages due to encryption). On Intel TDX this would also result in undesirable behavior. Set the AS_INACCESSIBLE flag to prevent the MM from attempting unexpected accesses of this sort during operations like truncation. This may also in some cases yield a decent performance improvement for guest_memfd userspace implementations that hole-punch ranges immediately after private->shared conversions via KVM_SET_MEMORY_ATTRIBUTES, since the current implementation of truncate_inode_pages_range() always ends up zero'ing an entire 4K range if it is backing by a 2M folio. Link: https://lore.kernel.org/lkml/ZR9LYhpxTaTk6PJX@google.com/ Suggested-by: Sean Christopherson Signed-off-by: Michael Roth Message-ID: <20240329212444.395559-6-michael.roth@amd.com> Signed-off-by: Paolo Bonzini Acked-by: Vlastimil Babka --- virt/kvm/guest_memfd.c | 1 + 1 file changed, 1 insertion(+) diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 0f4e0cf4f158..5a929536ecf2 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -357,6 +357,7 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags) inode->i_private = (void *)(unsigned long)flags; inode->i_op = &kvm_gmem_iops; inode->i_mapping->a_ops = &kvm_gmem_aops; + inode->i_mapping->flags |= AS_INACCESSIBLE; inode->i_mode |= S_IFREG; inode->i_size = size; mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER); From patchwork Thu Apr 4 18:50:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618138 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5213E131BDB for ; Thu, 4 Apr 2024 18:50:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256642; cv=none; b=HRCGHTse9ZznQq6Y2VEFPqNMpM20JAvwiCCiJu7YuafpAlSMBcWg0O09ALYNw99RvU9vx81phrftEp0FCq0YJQr3L9v8EUWPZPrC0MCItV1SRpibJb5BG3B43IYonTMzfvMQY8KnPLi+0IY3s16+lfBx0wXRphJEpXnWrY7ze18= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256642; c=relaxed/simple; bh=F9Opcs5Sfjt55zjO6/Xlftcpp0gbWr27hsN6hC6zdq0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=oIr3Wnnbk+KvMioUtPP1a46Z/9fIeFCj9nt4E0KmS2Oz2kdSiPjinFqFvrYNn73jwPz/E5ZEZHcP+rA5adoh7LL0hngatG2vIFv0VmP5/3sDnh1YO9atfKvfK+ELN/5TQDIbZTIFSvHePZkTabGEfJ+KgvzHOHSP1pdNsfhcIdk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=CGwbto6/; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="CGwbto6/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256639; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IGNOj1ZA+U6rAjLuPpugLuw9vdunQ/dJ6CvoC6KHQIo=; b=CGwbto6/ViFZrsOs8kiwksDMkpq634tYwaWX/x7rrKAVIuv7AgNs0IxEPgYbPm822eNBBK B8F+9xzUxG4gXhpYyJikGPu7gv/2B5BPciqhj77vQ3OPIdItqTf8TEK9pECdYFkHr9mjhs /lMVF3Kq8gGUgHtnBxDruY+TQ7j8+Vk= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-14-4NWzNcSxOBKwgJ5puyVL_Q-1; Thu, 04 Apr 2024 14:50:35 -0400 X-MC-Unique: 4NWzNcSxOBKwgJ5puyVL_Q-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5FEBB85A588; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 39C4E1C060A4; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com Subject: [PATCH 03/11] KVM: guest_memfd: pass error up from filemap_grab_folio Date: Thu, 4 Apr 2024 14:50:25 -0400 Message-ID: <20240404185034.3184582-4-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 Some SNP ioctls will require the page not to be in the pagecache, and as such they will want to return EEXIST to userspace. Start by passing the error up from filemap_grab_folio. Signed-off-by: Paolo Bonzini --- virt/kvm/guest_memfd.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 5a929536ecf2..409cf9b51313 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -20,7 +20,7 @@ static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index) /* TODO: Support huge pages. */ folio = filemap_grab_folio(inode->i_mapping, index); if (IS_ERR_OR_NULL(folio)) - return NULL; + return folio; /* * Use the up-to-date flag to track whether or not the memory has been @@ -146,8 +146,8 @@ static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len) } folio = kvm_gmem_get_folio(inode, index); - if (!folio) { - r = -ENOMEM; + if (IS_ERR_OR_NULL(folio)) { + r = folio ? PTR_ERR(folio) : -ENOMEM; break; } From patchwork Thu Apr 4 18:50:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618144 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 91B7F1332A7 for ; Thu, 4 Apr 2024 18:50:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256644; cv=none; b=Vbxao7w+Hfz0s2N0FA0vwU7JdfYdJEuFmAX+H1/SvJI0Lh68Ngl7ntXLK4SGEogW33HfW3+MHEeiw5ZlwBi226gHI8DFnjVpcVtrX/b6bq7FHuz/RSlPFyRuH4PQwyfRr1ensC4gtqbBUR2WMmvp+rub3VHo/joDjVz7ibLoZss= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256644; c=relaxed/simple; bh=VbgH6uZaEHHyV8AL0EthlfR7wtEenAYOlnIUsv2wVH0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=FJZ+7NnAcKt57Obe8qztcFaP1bJZc4s4eEvEMyDpcenQDZdVoWx8z3L13bdoinTle6takilQvE7NjyGUBOvFN10YYKDuNyGH/FufEJP8+QhtUDtT9NhuNbLcFr5UMqbgbH8ig1i0LgubEPyiSIgOyvMAp551T1wWZhPnPbOA7x0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=a0aZuJ02; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="a0aZuJ02" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256640; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=K+P0fAcIDgRo831gUWKyfWbrgU4oYVbYxE0Ac1b196I=; b=a0aZuJ020nd4CZAouQuurVF7TmhaAifHGyqcvQ2dUva7cm/FwQZONnu491cm6wDgmxiw7N ZKthykuyVCDy1GAwGti3atursIg2I4gd8rWEr152Jy8vfFPZMt/9q2nrFZykrFg2rgADEH 6MQ7FtDUWGEiDl5aAVExsxtgtzcqp28= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-232-3pSD-D3-MhqNQ8Jx0YD62w-1; Thu, 04 Apr 2024 14:50:36 -0400 X-MC-Unique: 3pSD-D3-MhqNQ8Jx0YD62w-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9988985CE41; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 687EE1C060A4; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com, Matthew Wilcox , Yosry Ahmed Subject: [PATCH 04/11] filemap: add FGP_CREAT_ONLY Date: Thu, 4 Apr 2024 14:50:26 -0400 Message-ID: <20240404185034.3184582-5-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 KVM would like to add a ioctl to encrypt and install a page into private memory (i.e. into a guest_memfd), in preparation for launching an encrypted guest. This API should be used only once per page (unless there are failures), so we want to rule out the possibility of operating on a page that is already in the guest_memfd's filemap. Overwriting the page is almost certainly a sign of a bug, so we might as well forbid it. Therefore, introduce a new flag for __filemap_get_folio (to be passed together with FGP_CREAT) that allows *adding* a new page to the filemap but not returning an existing one. An alternative possibility would be to force KVM users to initialize the whole filemap in one go, but that is complicated by the fact that the filemap includes pages of different kinds, including some that are per-vCPU rather than per-VM. Basically the result would be closer to a system call that multiplexes multiple ioctls, than to something cleaner like readv/writev. Races between callers that pass FGP_CREAT_ONLY are uninteresting to the filemap code: one of the racers wins and one fails with EEXIST, similar to calling open(2) with O_CREAT|O_EXCL. It doesn't matter to filemap.c if the missing synchronization is in the kernel or in userspace, and in fact it could even be intentional. (In the case of KVM it turns out that a mutex is taken around these calls for unrelated reasons, so there can be no races.) Cc: Matthew Wilcox Cc: Yosry Ahmed Signed-off-by: Paolo Bonzini --- include/linux/pagemap.h | 2 ++ mm/filemap.c | 4 ++++ 2 files changed, 6 insertions(+) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index f879c1d54da7..a8c0685e8c08 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -587,6 +587,7 @@ pgoff_t page_cache_prev_miss(struct address_space *mapping, * * %FGP_CREAT - If no folio is present then a new folio is allocated, * added to the page cache and the VM's LRU list. The folio is * returned locked. + * * %FGP_CREAT_ONLY - Fail if a folio is present * * %FGP_FOR_MMAP - The caller wants to do its own locking dance if the * folio is already in cache. If the folio was allocated, unlock it * before returning so the caller can do the same dance. @@ -607,6 +608,7 @@ typedef unsigned int __bitwise fgf_t; #define FGP_NOWAIT ((__force fgf_t)0x00000020) #define FGP_FOR_MMAP ((__force fgf_t)0x00000040) #define FGP_STABLE ((__force fgf_t)0x00000080) +#define FGP_CREAT_ONLY ((__force fgf_t)0x00000100) #define FGF_GET_ORDER(fgf) (((__force unsigned)fgf) >> 26) /* top 6 bits */ #define FGP_WRITEBEGIN (FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE) diff --git a/mm/filemap.c b/mm/filemap.c index 7437b2bd75c1..e7440e189ebd 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1863,6 +1863,10 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index, folio = NULL; if (!folio) goto no_page; + if (fgp_flags & FGP_CREAT_ONLY) { + folio_put(folio); + return ERR_PTR(-EEXIST); + } if (fgp_flags & FGP_LOCK) { if (fgp_flags & FGP_NOWAIT) { From patchwork Thu Apr 4 18:50:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618142 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 076671327EB for ; Thu, 4 Apr 2024 18:50:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256644; cv=none; b=HH/TpIOfgjD2+hP3JGUdA88VbP46UOCxcXgWXy/PqPJCFF4cD+nAhiUQSSPHYrAuS9R653JuLt25F/DkvCdOH5QMaOIK5KwQ7gaTzOvXWsDJ26xagoZkrx0B0mjNAwKU7kBTdpQLy0HjPStAZTHI8TQewRpLNxLDVm0noxuqFVQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256644; c=relaxed/simple; bh=8qvwxxQIeBV1Kd9XCFuoOg02QXfI1v1RFVcHAf+nVjY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=VwEcEXwHohtAopGOeWyHNnp1pHgcFdBJOtIz7+sT0qIVFy27FaF3klBuMJ+YW8/lvByxCBJSabXjE8BWPvrAgnWo6eroo5/dUC0WWbzn9SEBo6poI5Xu1RfktAiZM3ppseDacmr1bLm3kvo02PmADcBH420qkDpMQzLWTTGHXLE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Emmsvbvb; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Emmsvbvb" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256640; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nCDNNjDt/LfQbgJv7GVyWQGXDRpuwbJ3WGdMzH1/Yp4=; b=EmmsvbvbAwFCXw8jzY4TwT02j3Z/Z/IWC86A0NHRyJTUM6Jxe0rVEZa7UuCvW0jQvi8UAp wHiDgWps2fJDPsWycLOL6j92FULoCirH1IN78G4RsuhydUvM+NfDlOSgBejPgbDvLdnK0j oK2eClecDlEOfmTB3R8N5TJeYC3ERro= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-629-y1SfbgcDMBylaadPFTIj5A-1; Thu, 04 Apr 2024 14:50:36 -0400 X-MC-Unique: y1SfbgcDMBylaadPFTIj5A-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C75113C0F190; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id A15391C060A4; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com Subject: [PATCH 05/11] KVM: guest_memfd: limit overzealous WARN Date: Thu, 4 Apr 2024 14:50:27 -0400 Message-ID: <20240404185034.3184582-6-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 Because kvm_gmem_get_pfn() is called from the page fault path without any of the slots_lock, filemap lock or mmu_lock taken, it is possible for it to race with kvm_gmem_unbind(). This is not a problem, as any PTE that is installed temporarily will be zapped before the guest has the occasion to run. However, it is not possible to have a complete unbind+bind racing with the page fault, because deleting the memslot will call synchronize_srcu_expedited() and wait for the page fault to be resolved. Thus, we can still warn if the file is there and is not the one we expect. Signed-off-by: Paolo Bonzini --- virt/kvm/guest_memfd.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 409cf9b51313..e5b3cd02b651 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -499,7 +499,8 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot, gmem = file->private_data; - if (WARN_ON_ONCE(xa_load(&gmem->bindings, index) != slot)) { + if (xa_load(&gmem->bindings, index) != slot) { + WARN_ON_ONCE(xa_load(&gmem->bindings, index)); r = -EIO; goto out_fput; } From patchwork Thu Apr 4 18:50:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618147 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 41012133425 for ; Thu, 4 Apr 2024 18:50:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256645; cv=none; b=Q/nar2VbS9lEFH4EuwDOtuXCk9Kc7c59pQhsB1TFlJ6xK0qA9ncsau2QL66n8xB1IShGAIwq5tXqlvJXs6hGNT42pIwrKdBet3Pmd2nVb6MiaPUZ6LgzMluLjBgdTunxB/73mlHz/Zxjo0RmktiQX/OnOMAxE3LIxNzrm875dr8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256645; c=relaxed/simple; bh=B3S0LcYTsWFD+kUOBskvthZWdEqFSYGDHrnXM3mb+f4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Xu2NPQZSAc7MZVkOcEhOPGnKXKShinjFNDQ3ZvqFYwNA7k2+wPUUpwXLwVu94/ISJaoGNnMLhe9+QNgsmLvaXyXx6oc453y3NZOfwp0kwcaZElHcdytrFkcAkbNUena38CKDrK+T2zZ4PiVqJmyjyq/Xj2woHYPsA0miCK0cDsY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Hx+dcR9T; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Hx+dcR9T" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256641; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4+/Ieq4l7FZQ/9j1N8+GqzvijiucWTeDAVP6XMMN/jA=; b=Hx+dcR9THa+s/hmu44RWhhnji4Ciz+ajsA8mOdAp47l2dVZ/mfN4F6mFrnPKs0KHKAXj5f hGlv4EaeHs+A6/BC5TpzmHHEIEY7UKkkZCNJEEfXr8vh5A0I94QuyKfB17oFp/GdbSrrZ5 Y7ehQqV4kr31ddMqsuDRTxtlEWS6PFk= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-252-YXh2MNpQN2uul4WBoHbJsQ-1; Thu, 04 Apr 2024 14:50:36 -0400 X-MC-Unique: YXh2MNpQN2uul4WBoHbJsQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 047858007BB; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id D274D1C060A4; Thu, 4 Apr 2024 18:50:35 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com Subject: [PATCH 06/11] KVM: guest_memfd: Add hook for initializing memory Date: Thu, 4 Apr 2024 14:50:28 -0400 Message-ID: <20240404185034.3184582-7-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 guest_memfd pages are generally expected to be in some arch-defined initial state prior to using them for guest memory. For SEV-SNP this initial state is 'private', or 'guest-owned', and requires additional operations to move these pages into a 'private' state by updating the corresponding entries the RMP table. Allow for an arch-defined hook to handle updates of this sort, and go ahead and implement one for x86 so KVM implementations like AMD SVM can register a kvm_x86_ops callback to handle these updates for SEV-SNP guests. The preparation callback is always called when allocating/grabbing folios via gmem, and it is up to the architecture to keep track of whether or not the pages are already in the expected state (e.g. the RMP table in the case of SEV-SNP). In some cases, it is necessary to defer the preparation of the pages to handle things like in-place encryption of initial guest memory payloads before marking these pages as 'private'/'guest-owned'. Add an argument (always true for now) to kvm_gmem_get_folio() that allows for the preparation callback to be bypassed. To detect possible issues in the way userspace initializes memory, it is only possible to add an unprepared page if it is not already included in the filemap. Link: https://lore.kernel.org/lkml/ZLqVdvsF11Ddo7Dq@google.com/ Co-developed-by: Michael Roth Signed-off-by: Michael Roth Message-Id: <20231230172351.574091-5-michael.roth@amd.com> Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/x86.c | 6 +++ include/linux/kvm_host.h | 5 +++ virt/kvm/Kconfig | 4 ++ virt/kvm/guest_memfd.c | 65 ++++++++++++++++++++++++++++-- 6 files changed, 78 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 5187fcf4b610..d26fcad13e36 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -139,6 +139,7 @@ KVM_X86_OP(vcpu_deliver_sipi_vector) KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons); KVM_X86_OP_OPTIONAL(get_untagged_addr) KVM_X86_OP_OPTIONAL(alloc_apic_backing_page) +KVM_X86_OP_OPTIONAL_RET0(gmem_prepare) #undef KVM_X86_OP #undef KVM_X86_OP_OPTIONAL diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 01c69840647e..f101fab0040e 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1809,6 +1809,7 @@ struct kvm_x86_ops { gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags); void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu); + int (*gmem_prepare)(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order); }; struct kvm_x86_nested_ops { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 2d2619d3eee4..972524ddcfdb 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -13598,6 +13598,12 @@ bool kvm_arch_no_poll(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(kvm_arch_no_poll); +#ifdef CONFIG_HAVE_KVM_GMEM_PREPARE +int kvm_arch_gmem_prepare(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, int max_order) +{ + return static_call(kvm_x86_gmem_prepare)(kvm, pfn, gfn, max_order); +} +#endif int kvm_spec_ctrl_test_value(u64 value) { diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 48f31dcd318a..33ed3b884a6b 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2445,4 +2445,9 @@ static inline int kvm_gmem_get_pfn(struct kvm *kvm, } #endif /* CONFIG_KVM_PRIVATE_MEM */ +#ifdef CONFIG_HAVE_KVM_GMEM_PREPARE +int kvm_arch_gmem_prepare(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, int max_order); +bool kvm_arch_gmem_prepare_needed(struct kvm *kvm); +#endif + #endif diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig index 29b73eedfe74..ca870157b2ed 100644 --- a/virt/kvm/Kconfig +++ b/virt/kvm/Kconfig @@ -109,3 +109,7 @@ config KVM_GENERIC_PRIVATE_MEM select KVM_GENERIC_MEMORY_ATTRIBUTES select KVM_PRIVATE_MEM bool + +config HAVE_KVM_GMEM_PREPARE + bool + depends on KVM_PRIVATE_MEM diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index e5b3cd02b651..486748e65f36 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -13,12 +13,60 @@ struct kvm_gmem { struct list_head entry; }; -static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index) +#ifdef CONFIG_HAVE_KVM_GMEM_PREPARE +bool __weak kvm_arch_gmem_prepare_needed(struct kvm *kvm) +{ + return false; +} +#endif + +static int kvm_gmem_prepare_folio(struct inode *inode, pgoff_t index, struct folio *folio) +{ +#ifdef CONFIG_HAVE_KVM_GMEM_PREPARE + struct list_head *gmem_list = &inode->i_mapping->i_private_list; + struct kvm_gmem *gmem; + + list_for_each_entry(gmem, gmem_list, entry) { + struct kvm_memory_slot *slot; + struct kvm *kvm = gmem->kvm; + struct page *page; + kvm_pfn_t pfn; + gfn_t gfn; + int rc; + + if (!kvm_arch_gmem_prepare_needed(kvm)) + continue; + + slot = xa_load(&gmem->bindings, index); + if (!slot) + continue; + + page = folio_file_page(folio, index); + pfn = page_to_pfn(page); + gfn = slot->base_gfn + index - slot->gmem.pgoff; + rc = kvm_arch_gmem_prepare(kvm, gfn, pfn, compound_order(compound_head(page))); + if (rc) { + pr_warn_ratelimited("gmem: Failed to prepare folio for index %lx, error %d.\n", + index, rc); + return rc; + } + } + +#endif + return 0; +} + +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index, bool prepare) { struct folio *folio; + fgf_t fgp_flags = FGP_LOCK | FGP_ACCESSED | FGP_CREAT; + + if (!prepare) + fgp_flags |= FGP_CREAT_ONLY; /* TODO: Support huge pages. */ - folio = filemap_grab_folio(inode->i_mapping, index); + folio = __filemap_get_folio(inode->i_mapping, index, fgp_flags, + mapping_gfp_mask(inode->i_mapping)); if (IS_ERR_OR_NULL(folio)) return folio; @@ -41,6 +89,15 @@ static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index) folio_mark_uptodate(folio); } + if (prepare) { + int r = kvm_gmem_prepare_folio(inode, index, folio); + if (r < 0) { + folio_unlock(folio); + folio_put(folio); + return ERR_PTR(r); + } + } + /* * Ignore accessed, referenced, and dirty flags. The memory is * unevictable and there is no storage to write back to. @@ -145,7 +202,7 @@ static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len) break; } - folio = kvm_gmem_get_folio(inode, index); + folio = kvm_gmem_get_folio(inode, index, true); if (IS_ERR_OR_NULL(folio)) { r = folio ? PTR_ERR(folio) : -ENOMEM; break; @@ -505,7 +562,7 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot, goto out_fput; } - folio = kvm_gmem_get_folio(file_inode(file), index); + folio = kvm_gmem_get_folio(file_inode(file), index, true); if (!folio) { r = -ENOMEM; goto out_fput; From patchwork Thu Apr 4 18:50:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618143 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 011EC131E43 for ; Thu, 4 Apr 2024 18:50:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256644; cv=none; b=gIRTisHZkJPtcMLyfm+GXBW2o9SuQyE6PWdJcvpXGmJw/LnbtHqX7xD+kDn5D5NKrzv0pAOSJfcGquSABnGwy4FG8Ja6Ys6ASrQ+95IPeo4OyonXtn2vvNawZ3Cggbzd5vJUEUTu9aqxZu0zsMkRA2D4aQqeisKQnKe3a67MX/0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256644; c=relaxed/simple; bh=DRZa6qCcwIxu3J6+pogOtSfaCQJILFOI4TxKUix+niw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=CscfDMoeq2IRSRwvzD0DxLmFjHxB7qpJNB5PDjBLlyHHPu+CLwHa5owg7JWKhQGrvegkvq6CsUy8UlWu3qwswIR5eQ8eWcPpNZV+jzaV97eCVUtIdU+567bxNzNsmWLHOypqjrb7D8hQn3h+9QgtPp94+L8Uq2NkWzAJ3tmQ2wU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=RhCejrIN; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="RhCejrIN" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256639; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sBXensWpE7Hh5cackwpZC/rR1Lhmxd3u0UUm9CSpoqA=; b=RhCejrINcRL1hWfJ+4QU4aKkeqCNzeJm1nQSEVwKtwpk+b6sd1QqXqDD93do6xKTzd+AAB krk7lyhWcSVYTQjShvJ0a1uTxYV0xKsA4IBrmPuSw/fpjaZuMqHq+IBorGPox0qB0sDksg edGm9g1DqME8yaaBPM5hSrsXdyIZxR4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-208-MlBolUypPySfhWFHt7vqkA-1; Thu, 04 Apr 2024 14:50:36 -0400 X-MC-Unique: MlBolUypPySfhWFHt7vqkA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 33930101A531; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0D1641C060A4; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com Subject: [PATCH 07/11] KVM: guest_memfd: extract __kvm_gmem_get_pfn() Date: Thu, 4 Apr 2024 14:50:29 -0400 Message-ID: <20240404185034.3184582-8-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 In preparation for adding a function that walks a set of pages provided by userspace and populates them in a guest_memfd, add a version of kvm_gmem_get_pfn() that has a "bool prepare" argument and passes it down to kvm_gmem_get_folio(). Populating guest memory has to call repeatedly __kvm_gmem_get_pfn() on the same file, so make the new function take struct file*. Signed-off-by: Paolo Bonzini Reviewed-by: Michael Roth --- virt/kvm/guest_memfd.c | 38 +++++++++++++++++++++++--------------- 1 file changed, 23 insertions(+), 15 deletions(-) diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 486748e65f36..a537a7e63ab5 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -540,33 +540,29 @@ void kvm_gmem_unbind(struct kvm_memory_slot *slot) fput(file); } -int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot, - gfn_t gfn, kvm_pfn_t *pfn, int *max_order) +static int __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot, + gfn_t gfn, kvm_pfn_t *pfn, int *max_order, bool prepare) { pgoff_t index = gfn - slot->base_gfn + slot->gmem.pgoff; - struct kvm_gmem *gmem; + struct kvm_gmem *gmem = file->private_data; struct folio *folio; struct page *page; - struct file *file; int r; - file = kvm_gmem_get_file(slot); - if (!file) + if (file != slot->gmem.file) { + WARN_ON_ONCE(slot->gmem.file); return -EFAULT; + } gmem = file->private_data; - if (xa_load(&gmem->bindings, index) != slot) { WARN_ON_ONCE(xa_load(&gmem->bindings, index)); - r = -EIO; - goto out_fput; + return -EIO; } folio = kvm_gmem_get_folio(file_inode(file), index, true); - if (!folio) { - r = -ENOMEM; - goto out_fput; - } + if (!folio) + return -ENOMEM; if (folio_test_hwpoison(folio)) { r = -EHWPOISON; @@ -583,9 +579,21 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot, out_unlock: folio_unlock(folio); -out_fput: - fput(file); return r; } + +int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot, + gfn_t gfn, kvm_pfn_t *pfn, int *max_order) +{ + struct file *file = kvm_gmem_get_file(slot); + int r; + + if (!file) + return -EFAULT; + + r = __kvm_gmem_get_pfn(file, slot, gfn, pfn, max_order, true); + fput(file); + return r; +} EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn); From patchwork Thu Apr 4 18:50:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618145 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CDCE613440C for ; Thu, 4 Apr 2024 18:50:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256645; cv=none; b=qKvWeHHrGOCmnqpHXrO+HwV2y4XIRY/FDMxezSw5RuJELY6S3N+a7E9qZZoZ9FvzNsVjXPYv+PsKvcIkJDLRC/Vt1+zukctWLuOFAHrPd19AYZ/7NOBRECxKkLi5mN4SrjDBJs7LmmxmYJQFoM5nLerd0U36qeSAlsJM/b6YJ+k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256645; c=relaxed/simple; bh=jdNIfZWZvPJII98ywtATikTD6VDJZQYwABwABAMUGS8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=nQ8rO6U4ThhWQrGeYWpynM6FzlJ6tjLNagYsVcT7es/r6bgTOhZt3r9I2N/bZZHKCERfhZfIR6jTLv5l3OBKDLbKpD/VQHZJqWa5dCCVaMLMx+XQXGwLHhTrkDJ4new800bZ4xVK7/47AjTjwu9Cwd6Z28WRJnWX77prjDBUVYQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=hvPvBNkc; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="hvPvBNkc" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256642; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JFajdk4fDiyh8VxZ37SkZXsZFlO1RmoQzZVvkfEWg0k=; b=hvPvBNkc5oZFwi8mSdmNZJsdkqYq/IBy+w6KonJPl0CF2EaBdHVVEd8Z3Ly8+L7ZqwTHVw z+OT76BoWJT6fs4aDR4Fjlr6g0X1ALnijbr3nkq/Rebqztga6PvgcRqOFJjo1pcdeFiP4O PWjZ1OE/t6+czWbnRHDW6JcFJLrzRq4= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-64-rFD63vFJMpWHpRDgCAYm6Q-1; Thu, 04 Apr 2024 14:50:36 -0400 X-MC-Unique: rFD63vFJMpWHpRDgCAYm6Q-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6107728B7402; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3B46D1C060A4; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com Subject: [PATCH 08/11] KVM: guest_memfd: extract __kvm_gmem_punch_hole() Date: Thu, 4 Apr 2024 14:50:30 -0400 Message-ID: <20240404185034.3184582-9-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 Extract a version of kvm_gmem_punch_hole() that expects the caller to take the filemap invalidate_lock. Signed-off-by: Paolo Bonzini --- virt/kvm/guest_memfd.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index a537a7e63ab5..51c99667690a 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -152,19 +152,12 @@ static void kvm_gmem_invalidate_end(struct kvm_gmem *gmem, pgoff_t start, } } -static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len) +static long __kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len) { struct list_head *gmem_list = &inode->i_mapping->i_private_list; pgoff_t start = offset >> PAGE_SHIFT; pgoff_t end = (offset + len) >> PAGE_SHIFT; struct kvm_gmem *gmem; - - /* - * Bindings must be stable across invalidation to ensure the start+end - * are balanced. - */ - filemap_invalidate_lock(inode->i_mapping); - list_for_each_entry(gmem, gmem_list, entry) kvm_gmem_invalidate_begin(gmem, start, end); @@ -173,11 +166,23 @@ static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len) list_for_each_entry(gmem, gmem_list, entry) kvm_gmem_invalidate_end(gmem, start, end); - filemap_invalidate_unlock(inode->i_mapping); - return 0; } +static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len) +{ + int r; + + /* + * Bindings must be stable across invalidation to ensure the start+end + * are balanced. + */ + filemap_invalidate_lock(inode->i_mapping); + r = __kvm_gmem_punch_hole(inode, offset, len); + filemap_invalidate_unlock(inode->i_mapping); + return r; +} + static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len) { struct address_space *mapping = inode->i_mapping; From patchwork Thu Apr 4 18:50:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618140 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5321613249E for ; Thu, 4 Apr 2024 18:50:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256643; cv=none; b=TvGD6oDhwfRF0lasW1mu0EAmQRYUvhsgnG7RcpzjiX+GC3DHY11rV/Wol9KIS0u0BkKeAWLBhzzJ1kcslqKDczmqzCN6dqBDEpxDcrIcHWGZm1DK8NurAIXqZhjIScn57krbDCnOb75AfLGw4lVM9WTp+tL7dypPjj757LWe4PU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256643; c=relaxed/simple; bh=sZo1i1kLk+XztE3tIXuKTUlt1pxUJrHw1JJ1hwkgtC0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=a65wS4O+IRMmazrdrPewo0JVyvKF/pCwTQ6s4C7rjjpYSCgqN9gB5Dt9WLDTlQRDghK2DITR8pubIYFrp55dJeuVBB8W3qZrYUp9YWQSKvZbWjlFRJT+rVfeW/T34plGvwrAfO5o1AAw4Jas+x1wEaGpmK+1Ysg07yA3WBXisd8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=eCHC+Uw0; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="eCHC+Uw0" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256640; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FvyoR/T19B7DHISxWoVI2byScY2P/MdC/oqveCWAOaQ=; b=eCHC+Uw0Ut4RSaxCJf8OpoSVy0TgFezIghDO8uE2u5KAbFBIEySHbCHEUMAZbLW7U03poE FMj9w9XaLBdUWKLSxyYxHK3xuFTLsDuYJRq/4VFQrRgR8pKvp00Db5OhKn4n7bDP0tMH+H gkoNddSunTI+7ICY1wY5pQINI42gzWU= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-64-Au_eUc0kONmIKNzTOgtqwA-1; Thu, 04 Apr 2024 14:50:36 -0400 X-MC-Unique: Au_eUc0kONmIKNzTOgtqwA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 91C4B3C0F192; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6A9E11C060A4; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com Subject: [PATCH 09/11] KVM: guest_memfd: Add interface for populating gmem pages with user data Date: Thu, 4 Apr 2024 14:50:31 -0400 Message-ID: <20240404185034.3184582-10-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 During guest run-time, kvm_arch_gmem_prepare() is issued as needed to prepare newly-allocated gmem pages prior to mapping them into the guest. In the case of SEV-SNP, this mainly involves setting the pages to private in the RMP table. However, for the GPA ranges comprising the initial guest payload, which are encrypted/measured prior to starting the guest, the gmem pages need to be accessed prior to setting them to private in the RMP table so they can be initialized with the userspace-provided data. Additionally, an SNP firmware call is needed afterward to encrypt them in-place and measure the contents into the guest's launch digest. While it is possible to bypass the kvm_arch_gmem_prepare() hooks so that this handling can be done in an open-coded/vendor-specific manner, this may expose more gmem-internal state/dependencies to external callers than necessary. Try to avoid this by implementing an interface that tries to handle as much of the common functionality inside gmem as possible, while also making it generic enough to potentially be usable/extensible for TDX as well. Suggested-by: Sean Christopherson Signed-off-by: Michael Roth Co-developed-by: Michael Roth Signed-off-by: Paolo Bonzini Signed-off-by: Isaku Yamahata --- include/linux/kvm_host.h | 26 ++++++++++++++ virt/kvm/guest_memfd.c | 78 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 104 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 33ed3b884a6b..97d57ec59789 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2450,4 +2450,30 @@ int kvm_arch_gmem_prepare(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, int max_ord bool kvm_arch_gmem_prepare_needed(struct kvm *kvm); #endif +/** + * kvm_gmem_populate() - Populate/prepare a GPA range with guest data + * + * @kvm: KVM instance + * @gfn: starting GFN to be populated + * @src: userspace-provided buffer containing data to copy into GFN range + * (passed to @post_populate, and incremented on each iteration + * if not NULL) + * @npages: number of pages to copy from userspace-buffer + * @post_populate: callback to issue for each gmem page that backs the GPA + * range + * @opaque: opaque data to pass to @post_populate callback + * + * This is primarily intended for cases where a gmem-backed GPA range needs + * to be initialized with userspace-provided data prior to being mapped into + * the guest as a private page. This should be called with the slots->lock + * held so that caller-enforced invariants regarding the expected memory + * attributes of the GPA range do not race with KVM_SET_MEMORY_ATTRIBUTES. + * + * Returns the number of pages that were populated. + */ +long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages, + int (*post_populate)(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, + void __user *src, int order, void *opaque), + void *opaque); + #endif diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 51c99667690a..e7de97382a67 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -602,3 +602,81 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot, return r; } EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn); + +static int kvm_gmem_undo_get_pfn(struct file *file, struct kvm_memory_slot *slot, + gfn_t gfn, int order) +{ + pgoff_t index = gfn - slot->base_gfn + slot->gmem.pgoff; + struct kvm_gmem *gmem = file->private_data; + + /* + * Races with kvm_gmem_unbind() must have been detected by + * __kvm_gmem_get_gfn(), because the invalidate_lock is + * taken between __kvm_gmem_get_gfn() and kvm_gmem_undo_get_pfn(). + */ + if (WARN_ON_ONCE(xa_load(&gmem->bindings, index) != slot)) + return -EIO; + + return __kvm_gmem_punch_hole(file_inode(file), index << PAGE_SHIFT, PAGE_SIZE << order); +} + +long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages, + int (*post_populate)(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, + void __user *src, int order, void *opaque), + void *opaque) +{ + struct file *file; + struct kvm_memory_slot *slot; + + int ret = 0, max_order; + long i; + + lockdep_assert_held(&kvm->slots_lock); + if (npages < 0) + return -EINVAL; + + slot = gfn_to_memslot(kvm, gfn); + if (!kvm_slot_can_be_private(slot)) + return -EINVAL; + + file = kvm_gmem_get_file(slot); + if (!file) + return -EFAULT; + + filemap_invalidate_lock(file->f_mapping); + + npages = min_t(ulong, slot->npages - (gfn - slot->base_gfn), npages); + for (i = 0; i < npages; i += (1 << max_order)) { + gfn_t this_gfn = gfn + i; + kvm_pfn_t pfn; + + ret = __kvm_gmem_get_pfn(file, slot, this_gfn, &pfn, &max_order, false); + if (ret) + break; + + if (!IS_ALIGNED(this_gfn, (1 << max_order)) || + (npages - i) < (1 << max_order)) + max_order = 0; + + if (post_populate) { + void __user *p = src ? src + i * PAGE_SIZE : NULL; + ret = post_populate(kvm, this_gfn, pfn, p, max_order, opaque); + } + + put_page(pfn_to_page(pfn)); + if (ret) { + /* + * Punch a hole so that FGP_CREAT_ONLY can succeed + * again. + */ + kvm_gmem_undo_get_pfn(file, slot, this_gfn, max_order); + break; + } + } + + filemap_invalidate_unlock(file->f_mapping); + + fput(file); + return ret && !i ? ret : i; +} +EXPORT_SYMBOL_GPL(kvm_gmem_populate); From patchwork Thu Apr 4 18:50:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618141 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8FB4F1332A1 for ; Thu, 4 Apr 2024 18:50:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256643; cv=none; b=KoH5TIK8akXnXikFRxu1AEXIqyok5BEjkMiHJ2MCvNfhhsG4GTw9UUxemhT29WK1gPR/m3HbHqSC/9Q+QaPyv3bqVe+lLHXkqnowCUQQ2fY10cRf4JCVX80A3xNwMccKozTj5K8Z4wNURuFARFHE65eAB33GnHtA/jHSWo5WihE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256643; c=relaxed/simple; bh=lH7jR8Jko64IRgmkF8sRM1H2UWKCIFsqlXz2X8glJ+Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=mxk1M0Qr+obnVzD3OCtVscfA+tXpwfxd9BO7AV9jxrAYR5d7bVm++NN8XmIVxSE4YbTCRgc4xgCDMt4qvy7tae5ZKjm/6Mmt6nI0C3EOSjwFylijlXcK7MB+624tdC9Z0w3SISOJTT4vE1C6V4YTS5fpH5a0QL2kQaO5LcaTAKk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=GU9GywVC; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="GU9GywVC" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256640; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BGuDZxdd8LJgYrU56BO8/CIic8VVHhfGx5lnP9x/JCo=; b=GU9GywVCPgKwSqd8rYgTe81RrxrItpsCnfxwVM3LQv2I/4bHv2OtyjkhFOdSGCFD9o2edA o6c/Z65pXMZ7piZr1GJhBm2A0xPsc6DjQkGxDgyBoYuCCjcDTcVbD8bp0Ol7EOjTJJGABy +qT1vtLDFUeIxMAqsm48kO/xLSH96wo= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-42-TU-jMcaSNrCubAMbdTs3rg-1; Thu, 04 Apr 2024 14:50:37 -0400 X-MC-Unique: TU-jMcaSNrCubAMbdTs3rg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DA5EF8007A1; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id B2EF7200A386; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com Subject: [PATCH 10/11] KVM: guest_memfd: Add hook for invalidating memory Date: Thu, 4 Apr 2024 14:50:32 -0400 Message-ID: <20240404185034.3184582-11-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.4 From: Michael Roth In some cases, like with SEV-SNP, guest memory needs to be updated in a platform-specific manner before it can be safely freed back to the host. Wire up arch-defined hooks to the .free_folio kvm_gmem_aops callback to allow for special handling of this sort when freeing memory in response to FALLOC_FL_PUNCH_HOLE operations and when releasing the inode, and go ahead and define an arch-specific hook for x86 since it will be needed for handling memory used for SEV-SNP guests. Signed-off-by: Michael Roth Message-Id: <20231230172351.574091-6-michael.roth@amd.com> Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/x86.c | 7 +++++++ include/linux/kvm_host.h | 4 ++++ virt/kvm/Kconfig | 4 ++++ virt/kvm/guest_memfd.c | 14 ++++++++++++++ 6 files changed, 31 insertions(+) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index d26fcad13e36..c81990937ab4 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -140,6 +140,7 @@ KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons); KVM_X86_OP_OPTIONAL(get_untagged_addr) KVM_X86_OP_OPTIONAL(alloc_apic_backing_page) KVM_X86_OP_OPTIONAL_RET0(gmem_prepare) +KVM_X86_OP_OPTIONAL(gmem_invalidate) #undef KVM_X86_OP #undef KVM_X86_OP_OPTIONAL diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index f101fab0040e..59c7b95034fc 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1810,6 +1810,7 @@ struct kvm_x86_ops { gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags); void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu); int (*gmem_prepare)(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order); + void (*gmem_invalidate)(kvm_pfn_t start, kvm_pfn_t end); }; struct kvm_x86_nested_ops { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 972524ddcfdb..83b8260443a3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -13605,6 +13605,13 @@ int kvm_arch_gmem_prepare(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, int max_ord } #endif +#ifdef CONFIG_HAVE_KVM_GMEM_INVALIDATE +void kvm_arch_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end) +{ + static_call_cond(kvm_x86_gmem_invalidate)(start, end); +} +#endif + int kvm_spec_ctrl_test_value(u64 value) { /* diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 97d57ec59789..ab439706ea2f 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2476,4 +2476,8 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages void __user *src, int order, void *opaque), void *opaque); +#ifdef CONFIG_HAVE_KVM_GMEM_INVALIDATE +void kvm_arch_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end); +#endif + #endif diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig index ca870157b2ed..754c6c923427 100644 --- a/virt/kvm/Kconfig +++ b/virt/kvm/Kconfig @@ -113,3 +113,7 @@ config KVM_GENERIC_PRIVATE_MEM config HAVE_KVM_GMEM_PREPARE bool depends on KVM_PRIVATE_MEM + +config HAVE_KVM_GMEM_INVALIDATE + bool + depends on KVM_PRIVATE_MEM diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index e7de97382a67..d6b92d9b935a 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -360,10 +360,24 @@ static int kvm_gmem_error_folio(struct address_space *mapping, struct folio *fol return MF_DELAYED; } +#ifdef CONFIG_HAVE_KVM_GMEM_INVALIDATE +static void kvm_gmem_free_folio(struct folio *folio) +{ + struct page *page = folio_page(folio, 0); + kvm_pfn_t pfn = page_to_pfn(page); + int order = folio_order(folio); + + kvm_arch_gmem_invalidate(pfn, pfn + (1ul << order)); +} +#endif + static const struct address_space_operations kvm_gmem_aops = { .dirty_folio = noop_dirty_folio, .migrate_folio = kvm_gmem_migrate_folio, .error_remove_folio = kvm_gmem_error_folio, +#ifdef CONFIG_HAVE_KVM_GMEM_INVALIDATE + .free_folio = kvm_gmem_free_folio, +#endif }; static int kvm_gmem_getattr(struct mnt_idmap *idmap, const struct path *path, From patchwork Thu Apr 4 18:50:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13618137 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 10BD8131BB9 for ; Thu, 4 Apr 2024 18:50:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256641; cv=none; b=WU3QZT1Y2hTw5MBtU5XmG8VaPD/ltydDQ+0sKu9f2Rxua1ex3gZYCvIGeQc7lz+uPs1YaK6VsjherlEN4iaFxXKETqj+NaOqV8r2Sm57dvTiuMRZLqj4+yxZHhjteq0F5EXBB4q0tTrNY2b9grGRrf0OkdXqtAoYcijX7f/Y7ko= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712256641; c=relaxed/simple; bh=JhCLT6R3dEV4Z8kPXKqhhzXcGFUKttw3pZHa+JT0OpU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=W1LP3VJfTvpoZ6/L17bMmHvnouIDLnls1IfVrWX6qUT+H98DQ2+4of8nki+MwNPYDH6jQmn9DrgRbXpi7NeEOjostm9I5R+ici9g4YpAksiqGLQjo0w1q4mxzgygxXH7VJsb17IaPJEFEWGk0Ce7yl5OiVF7n9jfK+YCreBibdE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=XN6XcKmF; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="XN6XcKmF" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1712256639; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Nc3SKOOYPHg3uIsI8oCYk9QWrQ5YHEK98RfXdPCT6Bs=; b=XN6XcKmFkE97tmm1QFxW0IIIuub5EE4loURy+AO7p+kG/GkY3UVRsmkXcPhHp4APaT1rzO gldh4UEbqW47if+droSgyLUKRIQQvuET9apa7VsGbahOWfzcvM4IOxGoAWF018mHWCBUe/ UvyAA6xB4QOPlTjY3Yb16xcFto9CiNo= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-58-9DwWGaL3OZ2V6bZuCBaOTQ-1; Thu, 04 Apr 2024 14:50:37 -0400 X-MC-Unique: 9DwWGaL3OZ2V6bZuCBaOTQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 157CE85A5B7; Thu, 4 Apr 2024 18:50:37 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id E2B8D2024517; Thu, 4 Apr 2024 18:50:36 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, michael.roth@amd.com, isaku.yamahata@intel.com Subject: [PATCH 11/11] KVM: x86: Add gmem hook for determining max NPT mapping level Date: Thu, 4 Apr 2024 14:50:33 -0400 Message-ID: <20240404185034.3184582-12-pbonzini@redhat.com> In-Reply-To: <20240404185034.3184582-1-pbonzini@redhat.com> References: <20240404185034.3184582-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.4 From: Michael Roth In the case of SEV-SNP, whether or not a 2MB page can be mapped via a 2MB mapping in the guest's nested page table depends on whether or not any subpages within the range have already been initialized as private in the RMP table. The existing mixed-attribute tracking in KVM is insufficient here, for instance: - gmem allocates 2MB page - guest issues PVALIDATE on 2MB page - guest later converts a subpage to shared - SNP host code issues PSMASH to split 2MB RMP mapping to 4K - KVM MMU splits NPT mapping to 4K At this point there are no mixed attributes, and KVM would normally allow for 2MB NPT mappings again, but this is actually not allowed because the RMP table mappings are 4K and cannot be promoted on the hypervisor side, so the NPT mappings must still be limited to 4K to match this. Add a hook to determine the max NPT mapping size in situations like this. Signed-off-by: Michael Roth Message-Id: <20231230172351.574091-31-michael.roth@amd.com> Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/mmu/mmu.c | 8 ++++++++ 3 files changed, 11 insertions(+) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index c81990937ab4..2db87a6fd52a 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -140,6 +140,7 @@ KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons); KVM_X86_OP_OPTIONAL(get_untagged_addr) KVM_X86_OP_OPTIONAL(alloc_apic_backing_page) KVM_X86_OP_OPTIONAL_RET0(gmem_prepare) +KVM_X86_OP_OPTIONAL_RET0(gmem_validate_fault) KVM_X86_OP_OPTIONAL(gmem_invalidate) #undef KVM_X86_OP diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 59c7b95034fc..67dc108dd366 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1811,6 +1811,8 @@ struct kvm_x86_ops { void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu); int (*gmem_prepare)(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, int max_order); void (*gmem_invalidate)(kvm_pfn_t start, kvm_pfn_t end); + int (*gmem_validate_fault)(struct kvm *kvm, kvm_pfn_t pfn, gfn_t gfn, bool is_private, + u8 *max_level); }; struct kvm_x86_nested_ops { diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 992e651540e8..13dd367b8af1 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4338,6 +4338,14 @@ static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu, fault->max_level); fault->map_writable = !(fault->slot->flags & KVM_MEM_READONLY); + r = static_call(kvm_x86_gmem_validate_fault)(vcpu->kvm, fault->pfn, + fault->gfn, fault->is_private, + &fault->max_level); + if (r) { + kvm_release_pfn_clean(fault->pfn); + return r; + } + return RET_PF_CONTINUE; }