From patchwork Thu Feb 20 17:05:35 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984298 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9614B212FB8 for ; Thu, 20 Feb 2025 17:06:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071177; cv=none; b=N7gvKLBMzPzQ4+XqgxM+2UEmp9qNSrl62omuNGSqCIevW6sgFeb9aJZGFYd1Kc+rqTKQzFwRXteweMUBH0xcjV0ZSbDnTZuFuvKDKiDYDd2JUIp7cFeLMhMjhMsW6zX2catJt+0IhnvIIj8wI/X29ieHgG8LmdpLJjvd9mW8Np0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071177; c=relaxed/simple; bh=aUmImd2Dmi0hdkdJs2P8lDWMzgvoAHfBZ2DkehgYL6E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=gln+GR8Oekw5Wg17mM18m1+DgOBQPnWsfEmaLlgPN7TNPKetTonteeR4oyPZPV8Pl3crq3alSQfMU6IchS8cgPu14CMjhysJYJRxGJif9r+TA6zOLwqHXgLtTMglkGus954VxbSU7gi2dVQdj6JDkGxIkxnbP2Opw8LQHUDHiS0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=LQgR0vRT; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="LQgR0vRT" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071174; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tqMW6SCVBhYqzefmY1pRMp1zvszgoP9a+Clap5bGIZw=; b=LQgR0vRTQpw4zGy8CQhJEcsZybNRoUPDmUYHiyNcWL56WPOmHnxrTe3l3jRCrwgcUEn9Eb iejw+MW3it6HGw2m+bEVe6sPGwwk98iJNK5NYC1KZOqtzdOw+MsIBoApTUwuBZD2dkW7pB UryEsqTjsO1PiSkwhwcwcMkGKse1aSE= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-649-5RLiLQYAOn-ACmUa6tGSWA-1; Thu, 20 Feb 2025 12:06:10 -0500 X-MC-Unique: 5RLiLQYAOn-ACmUa6tGSWA-1 X-Mimecast-MFC-AGG-ID: 5RLiLQYAOn-ACmUa6tGSWA_1740071168 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 539151800374; Thu, 20 Feb 2025 17:06:08 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 549681800D9B; Thu, 20 Feb 2025 17:06:06 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Sean Christopherson , Isaku Yamahata , Kai Huang , Binbin Wu , Yuan Yao , Dave Hansen Subject: [PATCH 01/30] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Date: Thu, 20 Feb 2025 12:05:35 -0500 Message-ID: <20250220170604.2279312-2-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Rick Edgecombe Intel TDX protects guest VMs from malicious host and certain physical attacks. Pre-TDX Intel hardware has support for a memory encryption architecture called MK-TME, which repurposes several high bits of physical address as "KeyID". TDX ends up with reserving a sub-range of MK-TME KeyIDs as "TDX private KeyIDs". Like MK-TME, these KeyIDs can be associated with an ephemeral key. For TDX this association is done by the TDX module. It also has its own tracking for which KeyIDs are in use. To do this ephemeral key setup and manipulate the TDX module's internal tracking, KVM will use the following SEAMCALLs: TDH.MNG.KEY.CONFIG: Mark the KeyID as in use, and initialize its ephemeral key. TDH.MNG.KEY.FREEID: Mark the KeyID as not in use. These SEAMCALLs both operate on TDR structures, which are setup using the previously added TDH.MNG.CREATE SEAMCALL. KVM's use of these operations will go like: - tdx_guest_keyid_alloc() - Initialize TD and TDR page with TDH.MNG.CREATE (not yet-added), passing KeyID - TDH.MNG.KEY.CONFIG to initialize the key - TD runs, teardown is started - TDH.MNG.KEY.FREEID - tdx_guest_keyid_free() Don't try to combine the tdx_guest_keyid_alloc() and TDH.MNG.KEY.CONFIG operations because TDH.MNG.CREATE and some locking need to be done in the middle. Don't combine TDH.MNG.KEY.FREEID and tdx_guest_keyid_free() so they are symmetrical with the creation path. So implement tdh_mng_key_config() and tdh_mng_key_freeid() as separate functions than tdx_guest_keyid_alloc() and tdx_guest_keyid_free(). The TDX module provides SEAMCALLs to hand pages to the TDX module for storing TDX controlled state. SEAMCALLs that operate on this state are directed to the appropriate TD VM using references to the pages originally provided for managing the TD's state. So the host kernel needs to track these pages, both as an ID for specifying which TD to operate on, and to allow them to be eventually reclaimed. The TD VM associated pages are called TDR (Trust Domain Root) and TDCS (Trust Domain Control Structure). Introduce "struct tdx_td" for holding references to pages provided to the TDX module for this TD VM associated state. Don't plan for any TD associated state that is controlled by KVM to live in this struct. Only expect it to hold data for concepts specific to the TDX architecture, for which there can't already be preexisting storage for in KVM. Add both the TDR page and an array of TDCS pages, even though the SEAMCALL wrappers will only need to know about the TDR pages for directing the SEAMCALLs to the right TD. Adding the TDCS pages to this struct will let all of the TD VM associated pages handed to the TDX module be tracked in one location. For a type to specify physical pages, use KVM's hpa_t type. Do this for KVM's benefit This is the common type used to hold physical addresses in KVM, so will make interoperability easier. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Signed-off-by: Kai Huang Signed-off-by: Rick Edgecombe Reviewed-by: Binbin Wu Reviewed-by: Yuan Yao Message-ID: <20241203010317.827803-2-rick.p.edgecombe@intel.com> Acked-by: Dave Hansen Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/tdx.h | 12 ++++++++++++ arch/x86/virt/vmx/tdx/tdx.c | 25 +++++++++++++++++++++++++ arch/x86/virt/vmx/tdx/tdx.h | 16 +++++++++------- 3 files changed, 46 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index b4b16dafd55e..5950e0a092ca 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -119,6 +119,18 @@ static inline u64 sc_retry(sc_func_t func, u64 fn, int tdx_cpu_enable(void); int tdx_enable(void); const char *tdx_dump_mce_info(struct mce *m); + +struct tdx_td { + /* TD root structure: */ + struct page *tdr_page; + + int tdcs_nr_pages; + /* TD control structure: */ + struct page **tdcs_pages; +}; + +u64 tdh_mng_key_config(struct tdx_td *td); +u64 tdh_mng_key_freeid(struct tdx_td *td); #else static inline void tdx_init(void) { } static inline int tdx_cpu_enable(void) { return -ENODEV; } diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 7fdb37387886..1ffbdb840004 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -1456,3 +1456,28 @@ void __init tdx_init(void) check_tdx_erratum(); } + +static inline u64 tdx_tdr_pa(struct tdx_td *td) +{ + return page_to_phys(td->tdr_page); +} + +u64 tdh_mng_key_config(struct tdx_td *td) +{ + struct tdx_module_args args = { + .rcx = tdx_tdr_pa(td), + }; + + return seamcall(TDH_MNG_KEY_CONFIG, &args); +} +EXPORT_SYMBOL_GPL(tdh_mng_key_config); + +u64 tdh_mng_key_freeid(struct tdx_td *td) +{ + struct tdx_module_args args = { + .rcx = tdx_tdr_pa(td), + }; + + return seamcall(TDH_MNG_KEY_FREEID, &args); +} +EXPORT_SYMBOL_GPL(tdh_mng_key_freeid); diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h index 4e3d533cdd61..5579317f67ab 100644 --- a/arch/x86/virt/vmx/tdx/tdx.h +++ b/arch/x86/virt/vmx/tdx/tdx.h @@ -15,13 +15,15 @@ /* * TDX module SEAMCALL leaf functions */ -#define TDH_PHYMEM_PAGE_RDMD 24 -#define TDH_SYS_KEY_CONFIG 31 -#define TDH_SYS_INIT 33 -#define TDH_SYS_RD 34 -#define TDH_SYS_LP_INIT 35 -#define TDH_SYS_TDMR_INIT 36 -#define TDH_SYS_CONFIG 45 +#define TDH_MNG_KEY_CONFIG 8 +#define TDH_MNG_KEY_FREEID 20 +#define TDH_PHYMEM_PAGE_RDMD 24 +#define TDH_SYS_KEY_CONFIG 31 +#define TDH_SYS_INIT 33 +#define TDH_SYS_RD 34 +#define TDH_SYS_LP_INIT 35 +#define TDH_SYS_TDMR_INIT 36 +#define TDH_SYS_CONFIG 45 /* TDX page types */ #define PT_NDA 0x0 From patchwork Thu Feb 20 17:05:36 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984299 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 172552135A3 for ; Thu, 20 Feb 2025 17:06:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071179; cv=none; b=jnXx6eCxhuEup5973l9vVd8o+DxtoDqPJmT18nO1XFzLEzQkVk5bSzjN3EB4E9j0eIolorOe8tzzfPZp1EGOOagi3s8pV9GTSbyRkoE0Y3kAUTSe7jBaK7jSh+/FG0WRiYmTio0cMlt40qCRDCollTgrEok6ODMa6ha9u/84HQc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071179; c=relaxed/simple; bh=YqJ2slmVYEuXEu6qIJT9H6vsKZoEST1bf8OMogekzxM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=NtASqCdNa6VhUkwUQ0DUa2QTHONAh8jc1vAIlSQ7L9s8Syw3P1OUW4DLFyNk4yJQjWC4HDIs2IqmUEeomtq3eGLV48VmxxkTG2oh6ztH2MKGfspKmVmFk5lWWX0ANMV11YCE95XU60s+X1054mSdXnEu7JVs6oD1G8GKQBZEmDQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=GqmoyYf/; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="GqmoyYf/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071176; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=31EDdwbA0MZObp650CjKk84NeS7DXwcBl5xYLzhl8+c=; b=GqmoyYf/V7+Oa/R6eTm3iR+K3k27zdEGSoG/295qVLjWBY3cOSxPufSHaPO9rV0PiSZ/ag +Bdx/0AaR3l8SMksdX20SsTEEOABwat/CmaJQdbgvcc+pEJgXXHyvVmotGdOac/MvYJr2Z 3PIfxpqAFfJaiNh2L2g9JVHs5zrTCKc= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-657-_BW29Fw7N-CKzY1FRR5HYg-1; Thu, 20 Feb 2025 12:06:12 -0500 X-MC-Unique: _BW29Fw7N-CKzY1FRR5HYg-1 X-Mimecast-MFC-AGG-ID: _BW29Fw7N-CKzY1FRR5HYg_1740071170 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4F814180087B; Thu, 20 Feb 2025 17:06:10 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 8A98C1800359; Thu, 20 Feb 2025 17:06:08 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Sean Christopherson , Isaku Yamahata , Kai Huang , Binbin Wu , Yuan Yao , Dave Hansen Subject: [PATCH 02/30] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation Date: Thu, 20 Feb 2025 12:05:36 -0500 Message-ID: <20250220170604.2279312-3-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Rick Edgecombe Intel TDX protects guest VMs from malicious hosts and certain physical attacks. It defines various control structures that hold state for things like TDs or vCPUs. These control structures are stored in pages given to the TDX module and encrypted with either the global KeyID or the guest KeyIDs. To manipulate these control structures the TDX module defines a few SEAMCALLs. KVM will use these during the process of creating a TD as follows: 1) Allocate a unique TDX KeyID for a new guest. 1) Call TDH.MNG.CREATE to create a "TD Root" (TDR) page, together with the new allocated KeyID. Unlike the rest of the TDX guest, the TDR page is crypto-protected by the 'global KeyID'. 2) Call the previously added TDH.MNG.KEY.CONFIG on each package to configure the KeyID for the guest. After this step, the KeyID to protect the guest is ready and the rest of the guest will be protected by this KeyID. 3) Call TDH.MNG.ADDCX to add TD Control Structure (TDCS) pages. 4) Call TDH.MNG.INIT to initialize the TDCS. To reclaim these pages for use by the kernel other SEAMCALLs are needed, which will be added in future patches. Add tdh_mng_addcx(), tdh_mng_create() and tdh_mng_init() to export these SEAMCALLs so that KVM can use them to create TDs. For SEAMCALLs that give a page to the TDX module to be encrypted, CLFLUSH the page mapped with KeyID 0, such that any dirty cache lines don't write back later and clobber TD memory or control structures. Don't worry about the other MK-TME KeyIDs because the kernel doesn't use them. The TDX docs specify that this flush is not needed unless the TDX module exposes the CLFLUSH_BEFORE_ALLOC feature bit. Be conservative and always flush. Add a helper function to facilitate this. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Signed-off-by: Kai Huang Signed-off-by: Rick Edgecombe Reviewed-by: Binbin Wu Reviewed-by: Yuan Yao Message-ID: <20241203010317.827803-3-rick.p.edgecombe@intel.com> Acked-by: Dave Hansen Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/tdx.h | 3 +++ arch/x86/virt/vmx/tdx/tdx.c | 51 +++++++++++++++++++++++++++++++++++++ arch/x86/virt/vmx/tdx/tdx.h | 3 +++ 3 files changed, 57 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 5950e0a092ca..6ba3b806e880 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -129,8 +129,11 @@ struct tdx_td { struct page **tdcs_pages; }; +u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page); u64 tdh_mng_key_config(struct tdx_td *td); +u64 tdh_mng_create(struct tdx_td *td, u16 hkid); u64 tdh_mng_key_freeid(struct tdx_td *td); +u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err); #else static inline void tdx_init(void) { } static inline int tdx_cpu_enable(void) { return -ENODEV; } diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 1ffbdb840004..ce4b1e96c5b0 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -1462,6 +1462,29 @@ static inline u64 tdx_tdr_pa(struct tdx_td *td) return page_to_phys(td->tdr_page); } +/* + * The TDX module exposes a CLFLUSH_BEFORE_ALLOC bit to specify whether + * a CLFLUSH of pages is required before handing them to the TDX module. + * Be conservative and make the code simpler by doing the CLFLUSH + * unconditionally. + */ +static void tdx_clflush_page(struct page *page) +{ + clflush_cache_range(page_to_virt(page), PAGE_SIZE); +} + +u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page) +{ + struct tdx_module_args args = { + .rcx = page_to_phys(tdcs_page), + .rdx = tdx_tdr_pa(td), + }; + + tdx_clflush_page(tdcs_page); + return seamcall(TDH_MNG_ADDCX, &args); +} +EXPORT_SYMBOL_GPL(tdh_mng_addcx); + u64 tdh_mng_key_config(struct tdx_td *td) { struct tdx_module_args args = { @@ -1472,6 +1495,18 @@ u64 tdh_mng_key_config(struct tdx_td *td) } EXPORT_SYMBOL_GPL(tdh_mng_key_config); +u64 tdh_mng_create(struct tdx_td *td, u16 hkid) +{ + struct tdx_module_args args = { + .rcx = tdx_tdr_pa(td), + .rdx = hkid, + }; + + tdx_clflush_page(td->tdr_page); + return seamcall(TDH_MNG_CREATE, &args); +} +EXPORT_SYMBOL_GPL(tdh_mng_create); + u64 tdh_mng_key_freeid(struct tdx_td *td) { struct tdx_module_args args = { @@ -1481,3 +1516,19 @@ u64 tdh_mng_key_freeid(struct tdx_td *td) return seamcall(TDH_MNG_KEY_FREEID, &args); } EXPORT_SYMBOL_GPL(tdh_mng_key_freeid); + +u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err) +{ + struct tdx_module_args args = { + .rcx = tdx_tdr_pa(td), + .rdx = td_params, + }; + u64 ret; + + ret = seamcall_ret(TDH_MNG_INIT, &args); + + *extended_err = args.rcx; + + return ret; +} +EXPORT_SYMBOL_GPL(tdh_mng_init); diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h index 5579317f67ab..0861c3f09576 100644 --- a/arch/x86/virt/vmx/tdx/tdx.h +++ b/arch/x86/virt/vmx/tdx/tdx.h @@ -15,8 +15,11 @@ /* * TDX module SEAMCALL leaf functions */ +#define TDH_MNG_ADDCX 1 #define TDH_MNG_KEY_CONFIG 8 +#define TDH_MNG_CREATE 9 #define TDH_MNG_KEY_FREEID 20 +#define TDH_MNG_INIT 21 #define TDH_PHYMEM_PAGE_RDMD 24 #define TDH_SYS_KEY_CONFIG 31 #define TDH_SYS_INIT 33 From patchwork Thu Feb 20 17:05:37 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984301 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 175E7212D66 for ; Thu, 20 Feb 2025 17:06:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071182; cv=none; b=fbadyJZQBRKhcjiYJxyrmTKGk6QuUGUym/G+zgWsHcb4G3xHn1Y5X1Hq+I/Ts0jt7dn/JlDkK/rB0W+B0RaAj24OZ1qWc0PpG66gYiZyZMScE2nFms3Gcec/XEIom/wbGeq3CADHwhGAd2+i/UybX0Rkbb+jWaenPMVGx3sBBFc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071182; c=relaxed/simple; bh=C11cFM17yzoQCRgS9TdGJ9DHeg6lvLv0JX89cC8cj/U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=f2Lqcf/RkMos6RSPaDbezm44a5FMNezZtizuRjy2pZQMp2Jphnf4g1Caod0esT4jI5Ym9i7JKOJFd22TYifG0GMoN//n+fJGGEwGIpjxma9fKD8gaawtLhdWUASo0Bejz+Lmp7vD4qJZwtxGi52IJ/GVsLpfOXuhxpBit86l42U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=e5eJENp2; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="e5eJENp2" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071178; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hB4wLHar9EB7VrqR/GY4wk+BmJM+LvI82FrpGT858TU=; b=e5eJENp2N0VEePYTZteYouwnQFz5WNJM538mtuG5dRKGHXRAxOYhKf29vkrYLo8q4G+rrD m4YVDz4tCOETxEIJaaMlvAveXYE5ZC0+Fu3mmqvTk0YlmQ1RxqlZIsbsO2WZXv+x4zvtR1 FtPLb2RPpXy8Cpx+oO75rEj5rnS3n1Y= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-152-V34fs54nO9ONpxb5hS7c6w-1; Thu, 20 Feb 2025 12:06:14 -0500 X-MC-Unique: V34fs54nO9ONpxb5hS7c6w-1 X-Mimecast-MFC-AGG-ID: V34fs54nO9ONpxb5hS7c6w_1740071172 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 3F2FC1800875; Thu, 20 Feb 2025 17:06:12 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 853D31800D9E; Thu, 20 Feb 2025 17:06:10 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Sean Christopherson , Isaku Yamahata , Kai Huang , Binbin Wu , Yuan Yao , Dave Hansen Subject: [PATCH 03/30] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation Date: Thu, 20 Feb 2025 12:05:37 -0500 Message-ID: <20250220170604.2279312-4-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Rick Edgecombe Intel TDX protects guest VMs from malicious host and certain physical attacks. It defines various control structures that hold state for virtualized components of the TD (i.e. VMs or vCPUs) These control structures are stored in pages given to the TDX module and encrypted with either the global KeyID or the guest KeyIDs. To manipulate these control structures the TDX module defines a few SEAMCALLs. KVM will use these during the process of creating a vCPU as follows: 1) Call TDH.VP.CREATE to create a TD vCPU Root (TDVPR) page for each vCPU. 2) Call TDH.VP.ADDCX to add per-vCPU control pages (TDCX) for each vCPU. 3) Call TDH.VP.INIT to initialize the TDCX for each vCPU. To reclaim these pages for use by the kernel other SEAMCALLs are needed, which will be added in future patches. Export functions to allow KVM to make these SEAMCALLs. Export two variants for TDH.VP.CREATE, in order to support the planned logic of KVM to support TDX modules with and without the ENUM_TOPOLOGY feature. If KVM can drop support for the !ENUM_TOPOLOGY case, this could go down a single version. Leave that for later discussion. The TDX module provides SEAMCALLs to hand pages to the TDX module for storing TDX controlled state. SEAMCALLs that operate on this state are directed to the appropriate TD vCPU using references to the pages originally provided for managing the vCPU's state. So the host kernel needs to track these pages, both as an ID for specifying which vCPU to operate on, and to allow them to be eventually reclaimed. The vCPU associated pages are called TDVPR (Trust Domain Virtual Processor Root) and TDCX (Trust Domain Control Extension). Introduce "struct tdx_vp" for holding references to pages provided to the TDX module for the TD vCPU associated state. Don't plan for any vCPU associated state that is controlled by KVM to live in this struct. Only expect it to hold data for concepts specific to the TDX architecture, for which there can't already be preexisting storage for in KVM. Add both the TDVPR page and an array of TDCX pages, even though the SEAMCALL wrappers will only need to know about the TDVPR pages for directing the SEAMCALLs to the right vCPU. Adding the TDCX pages to this struct will let all of the vCPU associated pages handed to the TDX module be tracked in one location. For a type to specify physical pages, use KVM's hpa_t type. Do this for KVM's benefit This is the common type used to hold physical addresses in KVM, so will make interoperability easier. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Signed-off-by: Kai Huang Signed-off-by: Rick Edgecombe Reviewed-by: Binbin Wu Reviewed-by: Yuan Yao Message-ID: <20241203010317.827803-4-rick.p.edgecombe@intel.com> Acked-by: Dave Hansen Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/tdx.h | 14 ++++++++++++ arch/x86/virt/vmx/tdx/tdx.c | 43 +++++++++++++++++++++++++++++++++++++ arch/x86/virt/vmx/tdx/tdx.h | 11 ++++++++++ 3 files changed, 68 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 6ba3b806e880..f783d5f1a0e1 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -127,13 +127,27 @@ struct tdx_td { int tdcs_nr_pages; /* TD control structure: */ struct page **tdcs_pages; + + /* Size of `tdcx_pages` in struct tdx_vp */ + int tdcx_nr_pages; +}; + +struct tdx_vp { + /* TDVP root page */ + struct page *tdvpr_page; + + /* TD vCPU control structure: */ + struct page **tdcx_pages; }; u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page); +u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page); u64 tdh_mng_key_config(struct tdx_td *td); u64 tdh_mng_create(struct tdx_td *td, u16 hkid); +u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp); u64 tdh_mng_key_freeid(struct tdx_td *td); u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err); +u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid); #else static inline void tdx_init(void) { } static inline int tdx_cpu_enable(void) { return -ENODEV; } diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index ce4b1e96c5b0..e55570575f22 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -5,6 +5,7 @@ * Intel Trusted Domain Extensions (TDX) support */ +#include "asm/page_types.h" #define pr_fmt(fmt) "virt/tdx: " fmt #include @@ -1462,6 +1463,11 @@ static inline u64 tdx_tdr_pa(struct tdx_td *td) return page_to_phys(td->tdr_page); } +static inline u64 tdx_tdvpr_pa(struct tdx_vp *td) +{ + return page_to_phys(td->tdvpr_page); +} + /* * The TDX module exposes a CLFLUSH_BEFORE_ALLOC bit to specify whether * a CLFLUSH of pages is required before handing them to the TDX module. @@ -1485,6 +1491,18 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page) } EXPORT_SYMBOL_GPL(tdh_mng_addcx); +u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page) +{ + struct tdx_module_args args = { + .rcx = page_to_phys(tdcx_page), + .rdx = tdx_tdvpr_pa(vp), + }; + + tdx_clflush_page(tdcx_page); + return seamcall(TDH_VP_ADDCX, &args); +} +EXPORT_SYMBOL_GPL(tdh_vp_addcx); + u64 tdh_mng_key_config(struct tdx_td *td) { struct tdx_module_args args = { @@ -1507,6 +1525,18 @@ u64 tdh_mng_create(struct tdx_td *td, u16 hkid) } EXPORT_SYMBOL_GPL(tdh_mng_create); +u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp) +{ + struct tdx_module_args args = { + .rcx = tdx_tdvpr_pa(vp), + .rdx = tdx_tdr_pa(td), + }; + + tdx_clflush_page(vp->tdvpr_page); + return seamcall(TDH_VP_CREATE, &args); +} +EXPORT_SYMBOL_GPL(tdh_vp_create); + u64 tdh_mng_key_freeid(struct tdx_td *td) { struct tdx_module_args args = { @@ -1532,3 +1562,16 @@ u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err) return ret; } EXPORT_SYMBOL_GPL(tdh_mng_init); + +u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid) +{ + struct tdx_module_args args = { + .rcx = tdx_tdvpr_pa(vp), + .rdx = initial_rcx, + .r8 = x2apicid, + }; + + /* apicid requires version == 1. */ + return seamcall(TDH_VP_INIT | (1ULL << TDX_VERSION_SHIFT), &args); +} +EXPORT_SYMBOL_GPL(tdh_vp_init); diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h index 0861c3f09576..f0464f7d9780 100644 --- a/arch/x86/virt/vmx/tdx/tdx.h +++ b/arch/x86/virt/vmx/tdx/tdx.h @@ -16,10 +16,13 @@ * TDX module SEAMCALL leaf functions */ #define TDH_MNG_ADDCX 1 +#define TDH_VP_ADDCX 4 #define TDH_MNG_KEY_CONFIG 8 #define TDH_MNG_CREATE 9 +#define TDH_VP_CREATE 10 #define TDH_MNG_KEY_FREEID 20 #define TDH_MNG_INIT 21 +#define TDH_VP_INIT 22 #define TDH_PHYMEM_PAGE_RDMD 24 #define TDH_SYS_KEY_CONFIG 31 #define TDH_SYS_INIT 33 @@ -28,6 +31,14 @@ #define TDH_SYS_TDMR_INIT 36 #define TDH_SYS_CONFIG 45 +/* + * SEAMCALL leaf: + * + * Bit 15:0 Leaf number + * Bit 23:16 Version number + */ +#define TDX_VERSION_SHIFT 16 + /* TDX page types */ #define PT_NDA 0x0 #define PT_RSVD 0x1 From patchwork Thu Feb 20 17:05:38 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984300 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 383ED2135B1 for ; Thu, 20 Feb 2025 17:06:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071180; cv=none; b=o0d7D6YxZASz9Rd1nwwb2xtP9YGM0BDOznzfokfZJ9t0/Z1npiL7VdWOP19sDpw5dCsDo0yOeE10KBwWeG/7wgnO1HSU0VlVcVgk1a1Pm0FBIRE5JZFCEhqDje6woDkZWyu6rMo0Csb2SgMGU0TticjhdW2CSdciRisYGCyMryg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071180; c=relaxed/simple; bh=wY1Pb1vPVwTQHEMlUj9j3SJ4mM5BlX9vgSFEdokefKo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=uE4Qb2KFq+C72AiYl9E19YtSgTgs/Di+s+A8IxVMs67+2V00k739pAaJTPI9pqvar141oHQPFCNqBrxVN2lJQgWlcUdEf+GsKKanESifCUan1RXq0x/NbfNkdVm2YAQbU+qxRiGBBN0VEJtWy4YRF6cG+5YNPf/RmhHhtNUWsWI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=L8PL2nUO; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="L8PL2nUO" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071177; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oCnObeshrys+iNedWnjUTWZfkadynDg4c9xMBwTmFtA=; b=L8PL2nUOP0hqWLBW6TyGxl6wYO5bn6k8vwZwitzp5Fu51f4CrFJdy+cTpIFf6NGowy5Jqy MOjGg+rWJBlpAjCENc/veUXhmm4XYz9OA4rvnYSN33tb6fsbqfImjF4csUxjq0+M5a0/BY 9ZFpq6OozqiN8uEgSFUIfOWqrhYInOs= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-693-0jWam95WNMmDQLN_1nWGVw-1; Thu, 20 Feb 2025 12:06:15 -0500 X-MC-Unique: 0jWam95WNMmDQLN_1nWGVw-1 X-Mimecast-MFC-AGG-ID: 0jWam95WNMmDQLN_1nWGVw_1740071174 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 08BCA18D95F8; Thu, 20 Feb 2025 17:06:14 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 75F441800DA0; Thu, 20 Feb 2025 17:06:12 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Sean Christopherson , Isaku Yamahata , Kai Huang , Binbin Wu , Yuan Yao , Dave Hansen Subject: [PATCH 04/30] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management Date: Thu, 20 Feb 2025 12:05:38 -0500 Message-ID: <20250220170604.2279312-5-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Rick Edgecombe Intel TDX protects guest VMs from malicious host and certain physical attacks. The TDX module uses pages provided by the host for both control structures and for TD guest pages. These pages are encrypted using the MK-TME encryption engine, with its special requirements around cache invalidation. For its own security, the TDX module ensures pages are flushed properly and track which usage they are currently assigned. For creating and tearing down TD VMs and vCPUs KVM will need to use the TDH.PHYMEM.PAGE.RECLAIM, TDH.PHYMEM.CACHE.WB, and TDH.PHYMEM.PAGE.WBINVD SEAMCALLs. Add tdh_phymem_page_reclaim() to enable KVM to call TDH.PHYMEM.PAGE.RECLAIM to reclaim the page for use by the host kernel. This effectively resets its state in the TDX module's page tracking (PAMT), if the page is available to be reclaimed. This will be used by KVM to reclaim the various types of pages owned by the TDX module. It will have a small wrapper in KVM that retries in the case of a relevant error code. Don't implement this wrapper in arch/x86 because KVM's solution around retrying SEAMCALLs will be better located in a single place. Add tdh_phymem_cache_wb() to enable KVM to call TDH.PHYMEM.CACHE.WB to do a cache write back in a way that the TDX module can verify, before it allows a KeyID to be freed. The KVM code will use this to have a small wrapper that handles retries. Since the TDH.PHYMEM.CACHE.WB operation is interruptible, have tdh_phymem_cache_wb() take a resume argument to pass this info to the TDX module for restarts. It is worth noting that this SEAMCALL uses a SEAM specific MSR to do the write back in sections. In this way it does export some new functionality that affects CPU state. Add tdh_phymem_page_wbinvd_tdr() to enable KVM to call TDH.PHYMEM.PAGE.WBINVD to do a cache write back and invalidate of a TDR, using the global KeyID. The underlying TDH.PHYMEM.PAGE.WBINVD SEAMCALL requires the related KeyID to be encoded into the SEAMCALL args. Since the global KeyID is not exposed to KVM, a dedicated wrapper is needed for TDR focused TDH.PHYMEM.PAGE.WBINVD operations. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Signed-off-by: Kai Huang Signed-off-by: Rick Edgecombe Reviewed-by: Binbin Wu Reviewed-by: Yuan Yao Message-ID: <20241203010317.827803-5-rick.p.edgecombe@intel.com> Acked-by: Dave Hansen Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/tdx.h | 17 +++++++++++++++ arch/x86/virt/vmx/tdx/tdx.c | 42 +++++++++++++++++++++++++++++++++++++ arch/x86/virt/vmx/tdx/tdx.h | 3 +++ 3 files changed, 62 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index f783d5f1a0e1..ea26d79ec9d9 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -33,6 +33,7 @@ #ifndef __ASSEMBLY__ #include +#include /* * Used by the #VE exception handler to gather the #VE exception @@ -140,6 +141,19 @@ struct tdx_vp { struct page **tdcx_pages; }; + +static inline u64 mk_keyed_paddr(u16 hkid, struct page *page) +{ + u64 ret; + + ret = page_to_phys(page); + /* KeyID bits are just above the physical address bits: */ + ret |= (u64)hkid << boot_cpu_data.x86_phys_bits; + + return ret; + +} + u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page); u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page); u64 tdh_mng_key_config(struct tdx_td *td); @@ -148,6 +162,9 @@ u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp); u64 tdh_mng_key_freeid(struct tdx_td *td); u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err); u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid); +u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size); +u64 tdh_phymem_cache_wb(bool resume); +u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td); #else static inline void tdx_init(void) { } static inline int tdx_cpu_enable(void) { return -ENODEV; } diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index e55570575f22..2350d25b4ca3 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -1575,3 +1575,45 @@ u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid) return seamcall(TDH_VP_INIT | (1ULL << TDX_VERSION_SHIFT), &args); } EXPORT_SYMBOL_GPL(tdh_vp_init); + +/* + * TDX ABI defines output operands as PT, OWNER and SIZE. These are TDX defined fomats. + * So despite the names, they must be interpted specially as described by the spec. Return + * them only for error reporting purposes. + */ +u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size) +{ + struct tdx_module_args args = { + .rcx = page_to_phys(page), + }; + u64 ret; + + ret = seamcall_ret(TDH_PHYMEM_PAGE_RECLAIM, &args); + + *tdx_pt = args.rcx; + *tdx_owner = args.rdx; + *tdx_size = args.r8; + + return ret; +} +EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim); + +u64 tdh_phymem_cache_wb(bool resume) +{ + struct tdx_module_args args = { + .rcx = resume ? 1 : 0, + }; + + return seamcall(TDH_PHYMEM_CACHE_WB, &args); +} +EXPORT_SYMBOL_GPL(tdh_phymem_cache_wb); + +u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td) +{ + struct tdx_module_args args = {}; + + args.rcx = mk_keyed_paddr(tdx_global_keyid, td->tdr_page); + + return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args); +} +EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr); diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h index f0464f7d9780..7a15c9afcdfa 100644 --- a/arch/x86/virt/vmx/tdx/tdx.h +++ b/arch/x86/virt/vmx/tdx/tdx.h @@ -24,11 +24,14 @@ #define TDH_MNG_INIT 21 #define TDH_VP_INIT 22 #define TDH_PHYMEM_PAGE_RDMD 24 +#define TDH_PHYMEM_PAGE_RECLAIM 28 #define TDH_SYS_KEY_CONFIG 31 #define TDH_SYS_INIT 33 #define TDH_SYS_RD 34 #define TDH_SYS_LP_INIT 35 #define TDH_SYS_TDMR_INIT 36 +#define TDH_PHYMEM_CACHE_WB 40 +#define TDH_PHYMEM_PAGE_WBINVD 41 #define TDH_SYS_CONFIG 45 /* From patchwork Thu Feb 20 17:05:39 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984302 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 95A112139C6 for ; Thu, 20 Feb 2025 17:06:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071182; cv=none; b=akqK0xCk1awNXkLG+vVIUIu8/drIotZ+V7luNcTB6k9nBDaFpKzLuMNERsWo5gK+qzA/ffI/Jp6re7I2S7nFa+W+oYljsl1UmalhXcIZdu7VmKzbc7lHmMDC/kqAZzqKaPfq5U7GIjX1/qOPEEyuHJ+TgFCW3Fw73vGEOsuLfB8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071182; c=relaxed/simple; bh=w5Y3laFWosTA2HphWOt7dFXmNJq5mupL3C0c3qcZd0o=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=YMRAU/iRi+xr089YyZ4+WlIdKmdw2c+nX8tDCqxFiLKMUKtLhjJbmMxev8eFl5OuCzFPEDf6P/C4YDEJ+BHhl0qn6qBSIhtaGenGtNtnRtbKiT8GiXwV6FuDvRnAVIAbnK6sPL+20rCvWjm9qwvmQwKBHZHDvbq07PIz5yvtkus= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=CbKujg0f; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="CbKujg0f" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071179; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wSZ0jETfnfwQLC1g7Ho7TibVNz56yfN4Sdo/oZXpN6E=; b=CbKujg0fkEL4YsEPogb3uDGo46QC6iiovtEdtz5U6/yERXy0c7XjlcpJtDwD4YrdMR3Z6E jhpbXV4XzpwRrh9OGJ0OLt81O/zE/10yBVeqJ0cWH0MX71PT9De+GhTLBrGCoB0LljGGWM 8fqk8ZIfgFL1HAAYqaHyvdgnrjtFgZo= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-678-HHxKl5aZMrK9oGu-t9qofw-1; Thu, 20 Feb 2025 12:06:17 -0500 X-MC-Unique: HHxKl5aZMrK9oGu-t9qofw-1 X-Mimecast-MFC-AGG-ID: HHxKl5aZMrK9oGu-t9qofw_1740071175 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id C03151800991; Thu, 20 Feb 2025 17:06:15 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 3D3C01800362; Thu, 20 Feb 2025 17:06:14 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Sean Christopherson , Isaku Yamahata , Kai Huang , Binbin Wu , Yuan Yao , Dave Hansen Subject: [PATCH 05/30] x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access Date: Thu, 20 Feb 2025 12:05:39 -0500 Message-ID: <20250220170604.2279312-6-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Rick Edgecombe Intel TDX protects guest VMs from malicious host and certain physical attacks. The TDX module has TD scoped and vCPU scoped "metadata fields". These fields are a bit like VMCS fields, and stored in data structures maintained by the TDX module. Export 3 SEAMCALLs for use in reading and writing these fields: Make tdh_mng_rd() use MNG.VP.RD to read the TD scoped metadata. Make tdh_vp_rd()/tdh_vp_wr() use TDH.VP.RD/WR to read/write the vCPU scoped metadata. KVM will use these by creating inline helpers that target various metadata sizes. Export the raw SEAMCALL leaf, to avoid exporting the large number of various sized helpers. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Signed-off-by: Kai Huang Signed-off-by: Rick Edgecombe Reviewed-by: Binbin Wu Reviewed-by: Yuan Yao Acked-by: Dave Hansen Message-ID: <20241203010317.827803-6-rick.p.edgecombe@intel.com> Acked-by: Dave Hansen Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/tdx.h | 3 +++ arch/x86/virt/vmx/tdx/tdx.c | 47 +++++++++++++++++++++++++++++++++++++ arch/x86/virt/vmx/tdx/tdx.h | 3 +++ 3 files changed, 53 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index ea26d79ec9d9..cdb7dc1c0339 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -159,9 +159,12 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page); u64 tdh_mng_key_config(struct tdx_td *td); u64 tdh_mng_create(struct tdx_td *td, u16 hkid); u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp); +u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data); u64 tdh_mng_key_freeid(struct tdx_td *td); u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err); u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid); +u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data); +u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask); u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size); u64 tdh_phymem_cache_wb(bool resume); u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td); diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 2350d25b4ca3..93150f3c1112 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -1537,6 +1537,23 @@ u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp) } EXPORT_SYMBOL_GPL(tdh_vp_create); +u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data) +{ + struct tdx_module_args args = { + .rcx = tdx_tdr_pa(td), + .rdx = field, + }; + u64 ret; + + ret = seamcall_ret(TDH_MNG_RD, &args); + + /* R8: Content of the field, or 0 in case of error. */ + *data = args.r8; + + return ret; +} +EXPORT_SYMBOL_GPL(tdh_mng_rd); + u64 tdh_mng_key_freeid(struct tdx_td *td) { struct tdx_module_args args = { @@ -1563,6 +1580,36 @@ u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err) } EXPORT_SYMBOL_GPL(tdh_mng_init); +u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data) +{ + struct tdx_module_args args = { + .rcx = tdx_tdvpr_pa(vp), + .rdx = field, + }; + u64 ret; + + ret = seamcall_ret(TDH_VP_RD, &args); + + /* R8: Content of the field, or 0 in case of error. */ + *data = args.r8; + + return ret; +} +EXPORT_SYMBOL_GPL(tdh_vp_rd); + +u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask) +{ + struct tdx_module_args args = { + .rcx = tdx_tdvpr_pa(vp), + .rdx = field, + .r8 = data, + .r9 = mask, + }; + + return seamcall(TDH_VP_WR, &args); +} +EXPORT_SYMBOL_GPL(tdh_vp_wr); + u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid) { struct tdx_module_args args = { diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h index 7a15c9afcdfa..aacd38b12989 100644 --- a/arch/x86/virt/vmx/tdx/tdx.h +++ b/arch/x86/virt/vmx/tdx/tdx.h @@ -19,11 +19,13 @@ #define TDH_VP_ADDCX 4 #define TDH_MNG_KEY_CONFIG 8 #define TDH_MNG_CREATE 9 +#define TDH_MNG_RD 11 #define TDH_VP_CREATE 10 #define TDH_MNG_KEY_FREEID 20 #define TDH_MNG_INIT 21 #define TDH_VP_INIT 22 #define TDH_PHYMEM_PAGE_RDMD 24 +#define TDH_VP_RD 26 #define TDH_PHYMEM_PAGE_RECLAIM 28 #define TDH_SYS_KEY_CONFIG 31 #define TDH_SYS_INIT 33 @@ -32,6 +34,7 @@ #define TDH_SYS_TDMR_INIT 36 #define TDH_PHYMEM_CACHE_WB 40 #define TDH_PHYMEM_PAGE_WBINVD 41 +#define TDH_VP_WR 43 #define TDH_SYS_CONFIG 45 /* From patchwork Thu Feb 20 17:05:40 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984303 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B69292147ED for ; Thu, 20 Feb 2025 17:06:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071187; cv=none; b=SNuI5lFwcondygryDA8EGupVis7FT/ikI0FYoYHyEBZQm1f9cAdUo1PD2LprZuPxz7zH1Q/HdwzKvI5/88DLvZcGCm8cKhQhuHvyIMx1ya3BPl35GQU1dF6ovOfgd68ARTGhg46HSr0hFZS2MBm9Y1/Od6aIr2V0JHI1Tkv6xT8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071187; c=relaxed/simple; bh=ZyYtvE/bpCadBbzfb7NEGsdFr3A3of80r1AFMp+aQzc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Ekozbz6jbhsn/nIR0AFvT+45xg4OOHHTI9IBykRbZO0nrH5NPpn/B5VjtDgLCBOAVoeKoQc+7PCorfR+LX6qfZWzayNC2izeMYrfJepN8qaUBCgdhS/pgoj+qwpFZpq709cVDj5XS1fo0KVkiT5E5BZBA+a3xC4L/0LCqNlogrw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Okc0b7UN; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Okc0b7UN" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071184; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Ptu02ZYnYaomgeeQl6FJG/TklDuuC38iUaib5eK/egQ=; b=Okc0b7UNiH6Ch01rPkKUGiyUiEPWRicsyVPdMAVR+5ICh1iEABh82S/SwLSlXhpJiEbvtT 4mgXNHF6OW8mvDXwRznyh8WE8IkrC1ya5UX7PoNUvHnb4hqd4aX87F3RAuQGXdXl5GF1rG XXAx208CMhxYkY4wMakvC9ND0smb1qo= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-44-EpvGtWBwP0OxPZOrmK7JnQ-1; Thu, 20 Feb 2025 12:06:19 -0500 X-MC-Unique: EpvGtWBwP0OxPZOrmK7JnQ-1 X-Mimecast-MFC-AGG-ID: EpvGtWBwP0OxPZOrmK7JnQ_1740071177 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 857991800878; Thu, 20 Feb 2025 17:06:17 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 01D1E180035F; Thu, 20 Feb 2025 17:06:15 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Sean Christopherson , Isaku Yamahata , Kai Huang , Binbin Wu , Yuan Yao , Dave Hansen Subject: [PATCH 06/30] x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations Date: Thu, 20 Feb 2025 12:05:40 -0500 Message-ID: <20250220170604.2279312-7-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Rick Edgecombe Intel TDX protects guest VMs from malicious host and certain physical attacks. The TDX module has the concept of flushing vCPUs. These flushes include both a flush of the translation caches and also any other state internal to the TDX module. Before freeing a KeyID, this flush operation needs to be done. KVM will need to perform the flush on each pCPU associated with the TD, and also perform a TD scoped operation that checks if the flush has been done on all vCPU's associated with the TD. Add a tdh_vp_flush() function to be used to call TDH.VP.FLUSH on each pCPU associated with the TD during TD teardown. It will also be called when disabling TDX and during vCPU migration between pCPUs. Add tdh_mng_vpflushdone() to be used by KVM to call TDH.MNG.VPFLUSHDONE. KVM will use this during TD teardown to verify that TDH.VP.FLUSH has been called sufficiently, and advance the state machine that will allow for reclaiming the TD's KeyID. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Signed-off-by: Kai Huang Signed-off-by: Rick Edgecombe Reviewed-by: Binbin Wu Reviewed-by: Yuan Yao Message-ID: <20241203010317.827803-7-rick.p.edgecombe@intel.com> Acked-by: Dave Hansen Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/tdx.h | 2 ++ arch/x86/virt/vmx/tdx/tdx.c | 20 ++++++++++++++++++++ arch/x86/virt/vmx/tdx/tdx.h | 2 ++ 3 files changed, 24 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index cdb7dc1c0339..53cdb4896856 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -160,6 +160,8 @@ u64 tdh_mng_key_config(struct tdx_td *td); u64 tdh_mng_create(struct tdx_td *td, u16 hkid); u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp); u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data); +u64 tdh_vp_flush(struct tdx_vp *vp); +u64 tdh_mng_vpflushdone(struct tdx_td *td); u64 tdh_mng_key_freeid(struct tdx_td *td); u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err); u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid); diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 93150f3c1112..369b264f4d9b 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -1554,6 +1554,26 @@ u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data) } EXPORT_SYMBOL_GPL(tdh_mng_rd); +u64 tdh_vp_flush(struct tdx_vp *vp) +{ + struct tdx_module_args args = { + .rcx = tdx_tdvpr_pa(vp), + }; + + return seamcall(TDH_VP_FLUSH, &args); +} +EXPORT_SYMBOL_GPL(tdh_vp_flush); + +u64 tdh_mng_vpflushdone(struct tdx_td *td) +{ + struct tdx_module_args args = { + .rcx = tdx_tdr_pa(td), + }; + + return seamcall(TDH_MNG_VPFLUSHDONE, &args); +} +EXPORT_SYMBOL_GPL(tdh_mng_vpflushdone); + u64 tdh_mng_key_freeid(struct tdx_td *td) { struct tdx_module_args args = { diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h index aacd38b12989..62cb7832c42d 100644 --- a/arch/x86/virt/vmx/tdx/tdx.h +++ b/arch/x86/virt/vmx/tdx/tdx.h @@ -20,6 +20,8 @@ #define TDH_MNG_KEY_CONFIG 8 #define TDH_MNG_CREATE 9 #define TDH_MNG_RD 11 +#define TDH_VP_FLUSH 18 +#define TDH_MNG_VPFLUSHDONE 19 #define TDH_VP_CREATE 10 #define TDH_MNG_KEY_FREEID 20 #define TDH_MNG_INIT 21 From patchwork Thu Feb 20 17:05:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984304 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 070D02147F8 for ; Thu, 20 Feb 2025 17:06:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071187; cv=none; b=gw9cI2yl+2TjVZyeKQi/cAEB2Dexf3dIkQItH8+DoNAXkAAPqA+S77K38oiHuKwCWSm2bkAzuI6sW7GXj6zwvUYeyXiErCGLic+w/Gik/hzaXNCcWmQHGsCklQqRpe9qSvROkpZ7bvG3UVMKpSoUWvH40FL6NcCo2ZFlJDTtHko= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071187; c=relaxed/simple; bh=7o4MI2SWfvup5UD+qqNq6pDUDm+stiy6rOOP3Q809Qc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=OOj5iSUJ0FTeb6SI8IvQyUA56JOiyv3n8rkpjQJ2J0+2QpP51S2edIXkYXiiV4ZGqv/1Gvpas8pM73VZfsIkUP+3Lq3YJYDVRAq64rCboPqOOW/twHJojc4Rxola23s3+1gY1LYGmkVhAfvk/GA6WXFCDRRv0trkOubAq7KOgrc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=V3CbTL3K; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="V3CbTL3K" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071185; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OKSKOdbW5lvB/N/H9qi5+SwkOKNYt+M8QNhPBs9dcO8=; b=V3CbTL3KEn0xipcA56S3eMDe6p3jBKi3h3alQXwVUd4dWoSvaPKGzLZf0FbR7Rb0nbdv+X ODz2CTVJCw5LLaHzs0P3+LUXP9jSb6s4JU/32mBCAaiKtFQWzSLRsljRaWkks9kGm45O1q WAdkaz10PjccRTORo4QxcPTtY/cLfaE= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-403-T9jSxB5VMDKS2aKHpEQtYw-1; Thu, 20 Feb 2025 12:06:19 -0500 X-MC-Unique: T9jSxB5VMDKS2aKHpEQtYw-1 X-Mimecast-MFC-AGG-ID: T9jSxB5VMDKS2aKHpEQtYw_1740071178 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id AFEDD19560B9; Thu, 20 Feb 2025 17:06:18 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id BABE81800D9D; Thu, 20 Feb 2025 17:06:17 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe Subject: [PATCH 07/30] x86/virt/tdx: allocate tdx_sys_info in static memory Date: Thu, 20 Feb 2025 12:05:41 -0500 Message-ID: <20250220170604.2279312-8-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Adding all the information that KVM needs increases the size of struct tdx_sys_info, to the point that you can get warnings about the stack size of init_tdx_module(). Since KVM also needs to read the TDX metadata after init_tdx_module() returns, make the variable a global. Signed-off-by: Paolo Bonzini Reviewed-by: Kai Huang Reviewed-by: Rick Edgecombe --- arch/x86/virt/vmx/tdx/tdx.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 369b264f4d9b..578ad468634b 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -53,6 +53,8 @@ static DEFINE_MUTEX(tdx_module_lock); /* All TDX-usable memory regions. Protected by mem_hotplug_lock. */ static LIST_HEAD(tdx_memlist); +static struct tdx_sys_info tdx_sysinfo; + typedef void (*sc_err_func_t)(u64 fn, u64 err, struct tdx_module_args *args); static inline void seamcall_err(u64 fn, u64 err, struct tdx_module_args *args) @@ -1061,15 +1063,14 @@ static int init_tdmrs(struct tdmr_info_list *tdmr_list) static int init_tdx_module(void) { - struct tdx_sys_info sysinfo; int ret; - ret = get_tdx_sys_info(&sysinfo); + ret = get_tdx_sys_info(&tdx_sysinfo); if (ret) return ret; /* Check whether the kernel can support this module */ - ret = check_features(&sysinfo); + ret = check_features(&tdx_sysinfo); if (ret) return ret; @@ -1090,12 +1091,12 @@ static int init_tdx_module(void) goto out_put_tdxmem; /* Allocate enough space for constructing TDMRs */ - ret = alloc_tdmr_list(&tdx_tdmr_list, &sysinfo.tdmr); + ret = alloc_tdmr_list(&tdx_tdmr_list, &tdx_sysinfo.tdmr); if (ret) goto err_free_tdxmem; /* Cover all TDX-usable memory regions in TDMRs */ - ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo.tdmr); + ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &tdx_sysinfo.tdmr); if (ret) goto err_free_tdmrs; From patchwork Thu Feb 20 17:05:42 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984306 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A892A214A94 for ; Thu, 20 Feb 2025 17:06:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071191; cv=none; b=kj/uOW3j834/3c8QROd6aGbdn430FWHTfJtZ4h+iZADpOc6XbBXPv9zR2QQQdYjTK4cylFkbAY1qLBeNT/HWHvMimKGutUHeYiQpcMqk7oWDzfC6EeWlb+MGmJzVKiUTlEqK61lOHq8W/nPXtQ/7u9WF1FnUrvz/5uOk9Dei/pw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071191; c=relaxed/simple; bh=FlixckBccX6FBZzeGPG8flgsnluZb2qoVWfxMxTS69g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Ql3r5EBt38RWgI4qxYTKanun1SHEcuN/jlFxA8daIdJhR1TsEbo3SUTzwzw6xFJREXR/zdSQwIYH8hyDGXooMr2Vcya5su+5ri9EoAeajEOzV3UlWqdEzuMYZtCyCzz+GsQJBwPbEzXkX+uY4nLb5EbC6oK3U+zU/qLHJTAWItE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=I81jZ9+g; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="I81jZ9+g" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071186; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=496+NfmReiu1QtRFgMa3QzvQAaca1PTo3KmIjXjJBzw=; b=I81jZ9+gEhfq7fZd5T8FvmNZEcnXHMBVIevCPPlGWV5CeserS13F8Z1iwCgUPOoggZ1VQk Rb62/gEVc76U6WUzh0W1427dWDI4mRAhSeLFcuudf19uzkeHjg3NR02eNhhpFMG3Se7543 L2xh74NVc8yWv0/XelwzKQEmWCTsOvU= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-453-hEXvNwEDNEeGw98-7oanMA-1; Thu, 20 Feb 2025 12:06:21 -0500 X-MC-Unique: hEXvNwEDNEeGw98-7oanMA-1 X-Mimecast-MFC-AGG-ID: hEXvNwEDNEeGw98-7oanMA_1740071180 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id DC927190308C; Thu, 20 Feb 2025 17:06:19 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C81711800359; Thu, 20 Feb 2025 17:06:18 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Kai Huang , Dave Hansen Subject: [PATCH 08/30] x86/virt/tdx: Read essential global metadata for KVM Date: Thu, 20 Feb 2025 12:05:42 -0500 Message-ID: <20250220170604.2279312-9-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Kai Huang KVM needs two classes of global metadata to create and run TDX guests: - "TD Control Structures" - "TD Configurability" The first class contains the sizes of TDX guest per-VM and per-vCPU control structures. KVM will need to use them to allocate enough space for those control structures. The second class contains info which reports things like which features are configurable to TDX guest etc. KVM will need to use them to properly configure TDX guests. Read them for KVM TDX to use. The code change is auto-generated by re-running the script in [1] after uncommenting the "td_conf" and "td_ctrl" part to regenerate the tdx_global_metadata.{hc} and update them to the existing ones in the kernel. #python tdx.py global_metadata.json tdx_global_metadata.h \ tdx_global_metadata.c The 'global_metadata.json' can be fetched from [2]. Note that as of this writing, the JSON file only allows a maximum of 32 CPUID entries. While this is enough for current contents of the CPUID leaves, there were plans to change the JSON per TDX module release which would change the ABI and potentially prevent future versions of the TDX module from working with older kernels. While discussions are ongoing with the TDX module team on what exactly constitutes an ABI breakage, in the meantime the TDX module team has agreed to not increase the number of CPUID entries beyond 128 without an opt in. Therefore the file was tweaked by hand to change the maximum number of CPUID_CONFIGs. Link: https://lore.kernel.org/kvm/0853b155ec9aac09c594caa60914ed6ea4dc0a71.camel@intel.com/ [1] Link: https://cdrdv2.intel.com/v1/dl/getContent/795381 [2] Signed-off-by: Kai Huang Signed-off-by: Rick Edgecombe Message-ID: <20241030190039.77971-4-rick.p.edgecombe@intel.com> Acked-by: Dave Hansen Signed-off-by: Paolo Bonzini --- arch/x86/virt/vmx/tdx/tdx_global_metadata.c | 50 +++++++++++++++++++++ arch/x86/virt/vmx/tdx/tdx_global_metadata.h | 19 ++++++++ 2 files changed, 69 insertions(+) diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c index 8027a24d1c6e..13ad2663488b 100644 --- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c +++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c @@ -37,12 +37,62 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr) return ret; } +static int get_tdx_sys_info_td_ctrl(struct tdx_sys_info_td_ctrl *sysinfo_td_ctrl) +{ + int ret = 0; + u64 val; + + if (!ret && !(ret = read_sys_metadata_field(0x9800000100000000, &val))) + sysinfo_td_ctrl->tdr_base_size = val; + if (!ret && !(ret = read_sys_metadata_field(0x9800000100000100, &val))) + sysinfo_td_ctrl->tdcs_base_size = val; + if (!ret && !(ret = read_sys_metadata_field(0x9800000100000200, &val))) + sysinfo_td_ctrl->tdvps_base_size = val; + + return ret; +} + +static int get_tdx_sys_info_td_conf(struct tdx_sys_info_td_conf *sysinfo_td_conf) +{ + int ret = 0; + u64 val; + int i, j; + + if (!ret && !(ret = read_sys_metadata_field(0x1900000300000000, &val))) + sysinfo_td_conf->attributes_fixed0 = val; + if (!ret && !(ret = read_sys_metadata_field(0x1900000300000001, &val))) + sysinfo_td_conf->attributes_fixed1 = val; + if (!ret && !(ret = read_sys_metadata_field(0x1900000300000002, &val))) + sysinfo_td_conf->xfam_fixed0 = val; + if (!ret && !(ret = read_sys_metadata_field(0x1900000300000003, &val))) + sysinfo_td_conf->xfam_fixed1 = val; + if (!ret && !(ret = read_sys_metadata_field(0x9900000100000004, &val))) + sysinfo_td_conf->num_cpuid_config = val; + if (!ret && !(ret = read_sys_metadata_field(0x9900000100000008, &val))) + sysinfo_td_conf->max_vcpus_per_td = val; + if (sysinfo_td_conf->num_cpuid_config > ARRAY_SIZE(sysinfo_td_conf->cpuid_config_leaves)) + return -EINVAL; + for (i = 0; i < sysinfo_td_conf->num_cpuid_config; i++) + if (!ret && !(ret = read_sys_metadata_field(0x9900000300000400 + i, &val))) + sysinfo_td_conf->cpuid_config_leaves[i] = val; + if (sysinfo_td_conf->num_cpuid_config > ARRAY_SIZE(sysinfo_td_conf->cpuid_config_values)) + return -EINVAL; + for (i = 0; i < sysinfo_td_conf->num_cpuid_config; i++) + for (j = 0; j < 2; j++) + if (!ret && !(ret = read_sys_metadata_field(0x9900000300000500 + i * 2 + j, &val))) + sysinfo_td_conf->cpuid_config_values[i][j] = val; + + return ret; +} + static int get_tdx_sys_info(struct tdx_sys_info *sysinfo) { int ret = 0; ret = ret ?: get_tdx_sys_info_features(&sysinfo->features); ret = ret ?: get_tdx_sys_info_tdmr(&sysinfo->tdmr); + ret = ret ?: get_tdx_sys_info_td_ctrl(&sysinfo->td_ctrl); + ret = ret ?: get_tdx_sys_info_td_conf(&sysinfo->td_conf); return ret; } diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.h b/arch/x86/virt/vmx/tdx/tdx_global_metadata.h index 6dd3c9695f59..060a2ad744bf 100644 --- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.h +++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.h @@ -17,9 +17,28 @@ struct tdx_sys_info_tdmr { u16 pamt_1g_entry_size; }; +struct tdx_sys_info_td_ctrl { + u16 tdr_base_size; + u16 tdcs_base_size; + u16 tdvps_base_size; +}; + +struct tdx_sys_info_td_conf { + u64 attributes_fixed0; + u64 attributes_fixed1; + u64 xfam_fixed0; + u64 xfam_fixed1; + u16 num_cpuid_config; + u16 max_vcpus_per_td; + u64 cpuid_config_leaves[128]; + u64 cpuid_config_values[128][2]; +}; + struct tdx_sys_info { struct tdx_sys_info_features features; struct tdx_sys_info_tdmr tdmr; + struct tdx_sys_info_td_ctrl td_ctrl; + struct tdx_sys_info_td_conf td_conf; }; #endif From patchwork Thu Feb 20 17:05:43 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984305 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C2C30214A98 for ; Thu, 20 Feb 2025 17:06:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071189; cv=none; b=Nb4MA3Le7O/8ULjNMQPyuhoQePu2JuMgJWiPOcvXqnE/34ZAG/xCd4hIwJGZiCuOyHWPcy9tfzj2RQssqlMSZ7Kcx+dNs7lT4/TSEgsaAI+Gd5Is5YcU/b8Pchd6mhUIopSFzvRmoSMguQGnYyvkggkwl1oTecUEIGbpj13RP9g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071189; c=relaxed/simple; bh=MM+E2DhB6mP93V6kH9H0BWPQOci1r/gsNuB+UaxbiAY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=CZlWfGhE5G93xhvyBBYAGJHfRmsb8wLfDhHuMTrmXKsZvIZ1VylDHEIcsAA8+HISsmKQquVevIT4Jb8Zq8O9EHsDYfM2twd1+Rr7XmZet4HjUsaCP3Vsr34OzBKuZZSRzg45sOrs/oImB6niH0nZKfZeZbx2yoe9A/m79XHO+lc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=cN54JIuJ; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="cN54JIuJ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071186; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Jxlx/bW94Y2mBtTLd29pOhhWxpKoOzPdJnxQh04qFFg=; b=cN54JIuJ4CRIz4jFtVkvCJ4H7S3cfrZTsB3TqhcagcfSD+APMaDP52gemWVH5PqHopIVL0 xV0GuyZ5sGKoVf9q41CPrlYXqg/M8MwNOketmcPUsMrHg4dwTxB5WPpnAps8Ng6FqBvHoc VexcDdYz5zgZfHqwfKwgokxZK41sm98= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-460-H-g80tNFMUm4AAs_JyMU4A-1; Thu, 20 Feb 2025 12:06:22 -0500 X-MC-Unique: H-g80tNFMUm4AAs_JyMU4A-1 X-Mimecast-MFC-AGG-ID: H-g80tNFMUm4AAs_JyMU4A_1740071181 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 318AD1800879; Thu, 20 Feb 2025 17:06:21 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 1CD321800D9B; Thu, 20 Feb 2025 17:06:19 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Dave Hansen Subject: [PATCH 09/30] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID Date: Thu, 20 Feb 2025 12:05:43 -0500 Message-ID: <20250220170604.2279312-10-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Isaku Yamahata Intel TDX protects guest VMs from malicious host and certain physical attacks. Pre-TDX Intel hardware has support for a memory encryption architecture called MK-TME, which repurposes several high bits of physical address as "KeyID". The BIOS reserves a sub-range of MK-TME KeyIDs as "TDX private KeyIDs". Each TDX guest must be assigned with a unique TDX KeyID when it is created. The kernel reserves the first TDX private KeyID for crypto-protection of specific TDX module data which has a lifecycle that exceeds the KeyID reserved for the TD's use. The rest of the KeyIDs are left for TDX guests to use. Create a small KeyID allocator. Export tdx_guest_keyid_alloc()/tdx_guest_keyid_free() to allocate and free TDX guest KeyID for KVM to use. Don't provide the stub functions when CONFIG_INTEL_TDX_HOST=n since they are not supposed to be called in this case. Signed-off-by: Isaku Yamahata Signed-off-by: Rick Edgecombe Message-ID: <20241030190039.77971-5-rick.p.edgecombe@intel.com> Acked-by: Dave Hansen Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/tdx.h | 3 +++ arch/x86/virt/vmx/tdx/tdx.c | 17 +++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 53cdb4896856..26e29a7ba6f8 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -121,6 +121,9 @@ int tdx_cpu_enable(void); int tdx_enable(void); const char *tdx_dump_mce_info(struct mce *m); +int tdx_guest_keyid_alloc(void); +void tdx_guest_keyid_free(unsigned int keyid); + struct tdx_td { /* TD root structure: */ struct page *tdr_page; diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 578ad468634b..0122051af6b3 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include #include @@ -43,6 +44,8 @@ static u32 tdx_global_keyid __ro_after_init; static u32 tdx_guest_keyid_start __ro_after_init; static u32 tdx_nr_guest_keyids __ro_after_init; +static DEFINE_IDA(tdx_guest_keyid_pool); + static DEFINE_PER_CPU(bool, tdx_lp_initialized); static struct tdmr_info_list tdx_tdmr_list; @@ -1459,6 +1462,20 @@ void __init tdx_init(void) check_tdx_erratum(); } +int tdx_guest_keyid_alloc(void) +{ + return ida_alloc_range(&tdx_guest_keyid_pool, tdx_guest_keyid_start, + tdx_guest_keyid_start + tdx_nr_guest_keyids - 1, + GFP_KERNEL); +} +EXPORT_SYMBOL_GPL(tdx_guest_keyid_alloc); + +void tdx_guest_keyid_free(unsigned int keyid) +{ + ida_free(&tdx_guest_keyid_pool, keyid); +} +EXPORT_SYMBOL_GPL(tdx_guest_keyid_free); + static inline u64 tdx_tdr_pa(struct tdx_td *td) { return page_to_phys(td->tdr_page); From patchwork Thu Feb 20 17:05:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984310 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 15CB021324C for ; Thu, 20 Feb 2025 17:06:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071194; cv=none; b=diPM6c8vh4YR0YsrCyRjm41jFcvh0l8v7K8T49GmDIM3AAExMYp+9cTeTtpHyT6EzF9tZhpHLK0j+yXc5yXYuDAMZZUSBcOe1itD/3S64x7Ep2Um/D6aBMXe/0z/LlFHh4uB1J5B+CpyRPz4EBUDko8t8zGR/ayKaTIc4E7aOqc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071194; c=relaxed/simple; bh=WgEnRfvr5VjeDmHmE/772MoRbufnJXhWpD9DKHVCZGs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=qbXq0BY3m2jhAEsjUXZN8miDpUJoXE085f89g9OCh2TSNysjvBJEssFokD9KLP33e6lGZIK79xi53GSlMUcT/vs54Rlpotod1y7Td/ZUX3PAQRV73HheU+VR6JxW8MAa99wx7V/s28ZPoHKgnmWaiu3JhpCog/HIR9pmFCoZGPE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=I8pD+FHD; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="I8pD+FHD" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071189; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=t9Jhpoq6MLF9SSDoFe1IcDKpKG0153ZwG7g/LjbLYXo=; b=I8pD+FHDPy132024T88cij6zkC6SDmxgqfwOt5kjxe56YDnZXUaEJFllhnBxTCdcFq8RWC ErQ/xK4TdhuiNyG+/Q5wSLJfofqJ4LH3r+h+MTMk5tn2IOhgM/h/9WkazmTDo+6D4u8PkE QJXuQ2pd014WBbmvzKhi4ctfURy/se8= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-225-EZFt9lh5M2Wv7Ec8vtK4Rg-1; Thu, 20 Feb 2025 12:06:24 -0500 X-MC-Unique: EZFt9lh5M2Wv7Ec8vtK4Rg-1 X-Mimecast-MFC-AGG-ID: EZFt9lh5M2Wv7Ec8vtK4Rg_1740071183 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1D9C519783B2; Thu, 20 Feb 2025 17:06:23 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 24620180035F; Thu, 20 Feb 2025 17:06:22 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Kai Huang Subject: [PATCH 10/30] KVM: Export hardware virtualization enabling/disabling functions Date: Thu, 20 Feb 2025 12:05:44 -0500 Message-ID: <20250220170604.2279312-11-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Kai Huang To support TDX, KVM will need to enable TDX during KVM module loading time. Enabling TDX requires enabling hardware virtualization first so that all online CPUs (and the new CPU going online) are in post-VMXON state. KVM by default enables hardware virtualization but that is done in kvm_init(), which must be the last step after all initialization is done thus is too late for enabling TDX. Export functions to enable/disable hardware virtualization so that TDX code can use them to handle hardware virtualization enabling before kvm_init(). Signed-off-by: Kai Huang Message-ID: Signed-off-by: Paolo Bonzini --- include/linux/kvm_host.h | 8 ++++++++ virt/kvm/kvm_main.c | 18 ++++-------------- 2 files changed, 12 insertions(+), 14 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index f34f4cfaa513..1e75fa114f34 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2571,4 +2571,12 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu, struct kvm_pre_fault_memory *range); #endif +#ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING +int kvm_enable_virtualization(void); +void kvm_disable_virtualization(void); +#else +static inline int kvm_enable_virtualization(void) { return 0; } +static inline void kvm_disable_virtualization(void) { } +#endif + #endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index ba0327e2d0d3..6e40383fbe47 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -143,8 +143,6 @@ static int kvm_no_compat_open(struct inode *inode, struct file *file) #define KVM_COMPAT(c) .compat_ioctl = kvm_no_compat_ioctl, \ .open = kvm_no_compat_open #endif -static int kvm_enable_virtualization(void); -static void kvm_disable_virtualization(void); static void kvm_io_bus_destroy(struct kvm_io_bus *bus); @@ -5576,7 +5574,7 @@ static struct syscore_ops kvm_syscore_ops = { .shutdown = kvm_shutdown, }; -static int kvm_enable_virtualization(void) +int kvm_enable_virtualization(void) { int r; @@ -5621,8 +5619,9 @@ static int kvm_enable_virtualization(void) --kvm_usage_count; return r; } +EXPORT_SYMBOL_GPL(kvm_enable_virtualization); -static void kvm_disable_virtualization(void) +void kvm_disable_virtualization(void) { guard(mutex)(&kvm_usage_lock); @@ -5633,6 +5632,7 @@ static void kvm_disable_virtualization(void) cpuhp_remove_state(CPUHP_AP_KVM_ONLINE); kvm_arch_disable_virtualization(); } +EXPORT_SYMBOL_GPL(kvm_disable_virtualization); static int kvm_init_virtualization(void) { @@ -5648,21 +5648,11 @@ static void kvm_uninit_virtualization(void) kvm_disable_virtualization(); } #else /* CONFIG_KVM_GENERIC_HARDWARE_ENABLING */ -static int kvm_enable_virtualization(void) -{ - return 0; -} - static int kvm_init_virtualization(void) { return 0; } -static void kvm_disable_virtualization(void) -{ - -} - static void kvm_uninit_virtualization(void) { From patchwork Thu Feb 20 17:05:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984308 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3741921519A for ; Thu, 20 Feb 2025 17:06:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071193; cv=none; b=Ar8q0shXCzOJuLsxekxJTTrQQC7vfPBm7rIopcCkmlzj6TWoAkeQdQYHHjKxUagulDud9sDP26ihpLXBpiv9xDeHK+3tMnmw0rYNI4mS0o4PnB5t7zFErnaUO4RCwxDCSJfeypdimOW2IUf9COB8qD7zDihl75+laWpKrVNFrVM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071193; c=relaxed/simple; bh=kTtQMfy0OPRiPKBZHPizpbmfPChuFBCdB4JGHlsRg48=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=NAGsaKWHRD3TYcAJPfiEnzGDZAsRlLJ6svEB3Uc7E2CybFAnFqbaXf75Og4HqndWc43knRoEOa5MMqif8VDohjX3o7XZqJxF1j4OelAlXMny+4kgjp+z/cH8XBojlygTi0nTfxghQhuQEQQxDgkb6owOcODb3JaZ2kZTnCGMaKc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=UlPP4oEd; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UlPP4oEd" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071190; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NdSwM6D2J4s+/gws2Vf50KU/wQdM8P/TNvP4B/YaWQE=; b=UlPP4oEdl6VYemttYnra9Nvcaq8iuENGEuXkcADdAhHFOlfWJshGhG4NT9Yesdy8u25ob1 hxqRH3urOQ3keXU5ptqddCYu+x9T8VerhdF0oB3H9WVwU7RHEsnndcol23UkmU/Liyd2rb KBrVm0t1DmtwhvevDBZBNMqMah4JlPE= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-307-swlP6Zl9OqmNpELfhhS6Iw-1; Thu, 20 Feb 2025 12:06:26 -0500 X-MC-Unique: swlP6Zl9OqmNpELfhhS6Iw-1 X-Mimecast-MFC-AGG-ID: swlP6Zl9OqmNpELfhhS6Iw_1740071184 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 451BA19783B5; Thu, 20 Feb 2025 17:06:24 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 4F5B81800359; Thu, 20 Feb 2025 17:06:23 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Kai Huang Subject: [PATCH 11/30] KVM: VMX: Refactor VMX module init/exit functions Date: Thu, 20 Feb 2025 12:05:45 -0500 Message-ID: <20250220170604.2279312-12-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Kai Huang Add vt_init() and vt_exit() as the new module init/exit functions and refactor existing vmx_init()/vmx_exit() as helper to make room for TDX specific initialization and teardown. To support TDX, KVM will need to enable TDX during KVM module loading time. Enabling TDX requires enabling hardware virtualization first so that all online CPUs (and the new CPU going online) are in post-VMXON state. Currently, the vmx_init() flow is: 1) hv_init_evmcs(), 2) kvm_x86_vendor_init(), 3) Other VMX specific initialization, 4) kvm_init() The kvm_x86_vendor_init() invokes kvm_x86_init_ops::hardware_setup() to do VMX specific hardware setup and calls kvm_update_ops() to initialize kvm_x86_ops to VMX's version. TDX will have its own version for most of kvm_x86_ops callbacks. It would be nice if kvm_x86_init_ops::hardware_setup() could also be used for TDX, but in practice it cannot. The reason is, as mentioned above, TDX initialization requires hardware virtualization having been enabled, which must happen after kvm_update_ops(), but hardware_setup() is done before that. Also, TDX is based on VMX, and it makes sense to only initialize TDX after VMX has been initialized. If VMX fails to initialize, TDX is likely broken anyway. So the new flow of KVM module init function will be: 1) Current VMX initialization code in vmx_init() before kvm_init(), 2) TDX initialization, 3) kvm_init() Split vmx_init() into two parts based on above 1) and 3) so that TDX initialization can fit in between. Make part 1) as the new helper vmx_init(). Introduce vt_init() as the new module init function which calls vmx_init() and kvm_init(). TDX initialization will be added later. Do the same thing for vmx_exit()/vt_exit(). Signed-off-by: Kai Huang Message-ID: <3f23f24098bdcf42e213798893ffff7cdc7103be.1731664295.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/main.c | 32 ++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmx.c | 25 +++---------------------- arch/x86/kvm/vmx/vmx.h | 3 +++ 3 files changed, 38 insertions(+), 22 deletions(-) diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 43ee9ed11291..54cf95cb8d42 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -168,3 +168,35 @@ struct kvm_x86_init_ops vt_init_ops __initdata = { .runtime_ops = &vt_x86_ops, .pmu_ops = &intel_pmu_ops, }; + +static void __exit vt_exit(void) +{ + kvm_exit(); + vmx_exit(); +} +module_exit(vt_exit); + +static int __init vt_init(void) +{ + int r; + + r = vmx_init(); + if (r) + return r; + + /* + * Common KVM initialization _must_ come last, after this, /dev/kvm is + * exposed to userspace! + */ + r = kvm_init(sizeof(struct vcpu_vmx), __alignof__(struct vcpu_vmx), + THIS_MODULE); + if (r) + goto err_kvm_init; + + return 0; + +err_kvm_init: + vmx_exit(); + return r; +} +module_init(vt_init); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 6c56d5235f0f..8d3cfef0cf3b 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -8588,7 +8588,7 @@ __init int vmx_hardware_setup(void) return r; } -static void vmx_cleanup_l1d_flush(void) +static void __exit vmx_cleanup_l1d_flush(void) { if (vmx_l1d_flush_pages) { free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER); @@ -8598,23 +8598,16 @@ static void vmx_cleanup_l1d_flush(void) l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO; } -static void __vmx_exit(void) +void __exit vmx_exit(void) { allow_smaller_maxphyaddr = false; vmx_cleanup_l1d_flush(); -} -static void __exit vmx_exit(void) -{ - kvm_exit(); - __vmx_exit(); kvm_x86_vendor_exit(); - } -module_exit(vmx_exit); -static int __init vmx_init(void) +int __init vmx_init(void) { int r, cpu; @@ -8658,21 +8651,9 @@ static int __init vmx_init(void) if (!enable_ept) allow_smaller_maxphyaddr = true; - /* - * Common KVM initialization _must_ come last, after this, /dev/kvm is - * exposed to userspace! - */ - r = kvm_init(sizeof(struct vcpu_vmx), __alignof__(struct vcpu_vmx), - THIS_MODULE); - if (r) - goto err_kvm_init; - return 0; -err_kvm_init: - __vmx_exit(); err_l1d_flush: kvm_x86_vendor_exit(); return r; } -module_init(vmx_init); diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 8b111ce1087c..299fa1edf534 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -760,4 +760,7 @@ static inline void vmx_segment_cache_clear(struct vcpu_vmx *vmx) vmx->segment_cache.bitmask = 0; } +int vmx_init(void); +void vmx_exit(void); + #endif /* __KVM_X86_VMX_H */ From patchwork Thu Feb 20 17:05:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984311 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A2518217707 for ; Thu, 20 Feb 2025 17:06:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071197; cv=none; b=daxZ3PPI7xCyOX6jFWwQ8ZAPHDGWWLSczEnGVmcKoLWzNlpeaoxrwnWhVZPnk04ahQ/CIkeRlCutm8nY48p4hyPqU4BW2G9oOhzFa25CFBNs5WleT3xS1ooIxW41OdIy++KDhd6elbyp8gg9fy7Yq5EGeeo4X1YFv+1iw/Jt7yk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071197; c=relaxed/simple; bh=GPXl415FM6J1pzh7uDDHxyS5XkBrQQGcgJ7b35KNxNc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=CTpWecv76PCOlaywxdqn+Ugwj0B07ZjOnhg27MWgDy5Jj2wf/eTjA+gmP01Y91ikhsYch7/zJyYcoYO2W3ar0IviG8PAh6kMwlhGuYOIzGoerDPT0uWpyP+HhU4CgSHuGe+4/kHciATzJZfrYfSwiokxSU/3Lmn0UR1D2j8xUKE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=YsMIij+t; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="YsMIij+t" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071194; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Tycj0fO2fWePCC4bBBgnozZwpZn7IjOBveqRgM6PkYQ=; b=YsMIij+tddGU+jzNhz7C8UFKHLeazDwzyu5lNZc6yriAbu8cT0DAy7kL5ZUuIckFA4uB9p y+HXrm6Owb5bA+3azaVz0LHpJEb5N801PeZb82wnwLCzS6eRuciLM/jOw2J6wFxGw0B1Ra wdBNADk1cc2Nmg5JNW/JdBCCDCqUoHo= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-5-94aRXSPPO_ulyQzndo38ng-1; Thu, 20 Feb 2025 12:06:31 -0500 X-MC-Unique: 94aRXSPPO_ulyQzndo38ng-1 X-Mimecast-MFC-AGG-ID: 94aRXSPPO_ulyQzndo38ng_1740071185 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 6F46E1800983; Thu, 20 Feb 2025 17:06:25 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 79BDA1800359; Thu, 20 Feb 2025 17:06:24 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Kai Huang Subject: [PATCH 12/30] KVM: VMX: Initialize TDX during KVM module load Date: Thu, 20 Feb 2025 12:05:46 -0500 Message-ID: <20250220170604.2279312-13-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Kai Huang Before KVM can use TDX to create and run TDX guests, TDX needs to be initialized from two perspectives: 1) TDX module must be initialized properly to a working state; 2) A per-cpu TDX initialization, a.k.a the TDH.SYS.LP.INIT SEAMCALL must be done on any logical cpu before it can run any other TDX SEAMCALLs. The TDX host core-kernel provides two functions to do the above two respectively: tdx_enable() and tdx_cpu_enable(). There are two options in terms of when to initialize TDX: initialize TDX at KVM module loading time, or when creating the first TDX guest. Choose to initialize TDX during KVM module loading time: Initializing TDX module is both memory and CPU time consuming: 1) the kernel needs to allocate a non-trivial size(~1/256) of system memory as metadata used by TDX module to track each TDX-usable memory page's status; 2) the TDX module needs to initialize this metadata, one entry for each TDX-usable memory page. Also, the kernel uses alloc_contig_pages() to allocate those metadata chunks, because they are large and need to be physically contiguous. alloc_contig_pages() can fail. If initializing TDX when creating the first TDX guest, then there's chance that KVM won't be able to run any TDX guests albeit KVM _declares_ to be able to support TDX. This isn't good for the user. On the other hand, initializing TDX at KVM module loading time can make sure KVM is providing a consistent view of whether KVM can support TDX to the user. Always only try to initialize TDX after VMX has been initialized. TDX is based on VMX, and if VMX fails to initialize then TDX is likely to be broken anyway. Also, in practice, supporting TDX will require part of VMX and common x86 infrastructure in working order, so TDX cannot be enabled alone w/o VMX support. There are two cases that can result in failure to initialize TDX: 1) TDX cannot be supported (e.g., because of TDX is not supported or enabled by hardware, or module is not loaded, or missing some dependency in KVM's configuration); 2) Any unexpected error during TDX bring-up. For the first case only mark TDX is disabled but still allow KVM module to be loaded. For the second case just fail to load the KVM module so that the user can be aware. Because TDX costs additional memory, don't enable TDX by default. Add a new module parameter 'enable_tdx' to allow the user to opt-in. Note, the name tdx_init() has already been taken by the early boot code. Use tdx_bringup() for initializing TDX (and tdx_cleanup() since KVM doesn't actually teardown TDX). They don't match vt_init()/vt_exit(), vmx_init()/vmx_exit() etc but it's not end of the world. Also, once initialized, the TDX module cannot be disabled and enabled again w/o the TDX module runtime update, which isn't supported by the kernel. After TDX is enabled, nothing needs to be done when KVM disables hardware virtualization, e.g., when offlining CPU, or during suspend/resume. TDX host core-kernel code internally tracks TDX status and can handle "multiple enabling" scenario. Similar to KVM_AMD_SEV, add a new KVM_INTEL_TDX Kconfig to guide KVM TDX code. Make it depend on INTEL_TDX_HOST but not replace INTEL_TDX_HOST because in the longer term there's a use case that requires making SEAMCALLs w/o KVM as mentioned by Dan [1]. Link: https://lore.kernel.org/6723fc2070a96_60c3294dc@dwillia2-mobl3.amr.corp.intel.com.notmuch/ [1] Signed-off-by: Kai Huang Message-ID: <162f9dee05c729203b9ad6688db1ca2960b4b502.1731664295.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/tdx.h | 3 + .../tdx => include/asm}/tdx_global_metadata.h | 0 arch/x86/kvm/Kconfig | 10 ++ arch/x86/kvm/Makefile | 1 + arch/x86/kvm/vmx/main.c | 9 + arch/x86/kvm/vmx/tdx.c | 160 ++++++++++++++++++ arch/x86/kvm/vmx/tdx.h | 13 ++ arch/x86/virt/vmx/tdx/tdx.c | 14 ++ arch/x86/virt/vmx/tdx/tdx.h | 1 - include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 3 +- 11 files changed, 213 insertions(+), 2 deletions(-) rename arch/x86/{virt/vmx/tdx => include/asm}/tdx_global_metadata.h (100%) create mode 100644 arch/x86/kvm/vmx/tdx.c create mode 100644 arch/x86/kvm/vmx/tdx.h diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 26e29a7ba6f8..52a21075c0a6 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -33,6 +33,7 @@ #ifndef __ASSEMBLY__ #include +#include #include /* @@ -120,6 +121,7 @@ static inline u64 sc_retry(sc_func_t func, u64 fn, int tdx_cpu_enable(void); int tdx_enable(void); const char *tdx_dump_mce_info(struct mce *m); +const struct tdx_sys_info *tdx_get_sysinfo(void); int tdx_guest_keyid_alloc(void); void tdx_guest_keyid_free(unsigned int keyid); @@ -178,6 +180,7 @@ static inline void tdx_init(void) { } static inline int tdx_cpu_enable(void) { return -ENODEV; } static inline int tdx_enable(void) { return -ENODEV; } static inline const char *tdx_dump_mce_info(struct mce *m) { return NULL; } +static inline const struct tdx_sys_info *tdx_get_sysinfo(void) { return NULL; } #endif /* CONFIG_INTEL_TDX_HOST */ #endif /* !__ASSEMBLY__ */ diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.h b/arch/x86/include/asm/tdx_global_metadata.h similarity index 100% rename from arch/x86/virt/vmx/tdx/tdx_global_metadata.h rename to arch/x86/include/asm/tdx_global_metadata.h diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index ea2c4f21c1ca..fe8cbee6f614 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -128,6 +128,16 @@ config X86_SGX_KVM If unsure, say N. +config KVM_INTEL_TDX + bool "Intel Trust Domain Extensions (TDX) support" + default y + depends on INTEL_TDX_HOST + help + Provides support for launching Intel Trust Domain Extensions (TDX) + confidential VMs on Intel processors. + + If unsure, say N. + config KVM_AMD tristate "KVM for AMD processors support" depends on KVM && (CPU_SUP_AMD || CPU_SUP_HYGON) diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index f9dddb8cb466..a5d362c7b504 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -20,6 +20,7 @@ kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \ kvm-intel-$(CONFIG_X86_SGX_KVM) += vmx/sgx.o kvm-intel-$(CONFIG_KVM_HYPERV) += vmx/hyperv.o vmx/hyperv_evmcs.o +kvm-intel-$(CONFIG_KVM_INTEL_TDX) += vmx/tdx.o kvm-amd-y += svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 54cf95cb8d42..97c453187cc1 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -6,6 +6,7 @@ #include "nested.h" #include "pmu.h" #include "posted_intr.h" +#include "tdx.h" #define VMX_REQUIRED_APICV_INHIBITS \ (BIT(APICV_INHIBIT_REASON_DISABLED) | \ @@ -172,6 +173,7 @@ struct kvm_x86_init_ops vt_init_ops __initdata = { static void __exit vt_exit(void) { kvm_exit(); + tdx_cleanup(); vmx_exit(); } module_exit(vt_exit); @@ -184,6 +186,11 @@ static int __init vt_init(void) if (r) return r; + /* tdx_init() has been taken */ + r = tdx_bringup(); + if (r) + goto err_tdx_bringup; + /* * Common KVM initialization _must_ come last, after this, /dev/kvm is * exposed to userspace! @@ -196,6 +203,8 @@ static int __init vt_init(void) return 0; err_kvm_init: + tdx_cleanup(); +err_tdx_bringup: vmx_exit(); return r; } diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c new file mode 100644 index 000000000000..0666dfbe0bc0 --- /dev/null +++ b/arch/x86/kvm/vmx/tdx.c @@ -0,0 +1,160 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include "capabilities.h" +#include "tdx.h" + +#undef pr_fmt +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +static bool enable_tdx __ro_after_init; +module_param_named(tdx, enable_tdx, bool, 0444); + +static enum cpuhp_state tdx_cpuhp_state; + +static int tdx_online_cpu(unsigned int cpu) +{ + unsigned long flags; + int r; + + /* Sanity check CPU is already in post-VMXON */ + WARN_ON_ONCE(!(cr4_read_shadow() & X86_CR4_VMXE)); + + local_irq_save(flags); + r = tdx_cpu_enable(); + local_irq_restore(flags); + + return r; +} + +static void __do_tdx_cleanup(void) +{ + /* + * Once TDX module is initialized, it cannot be disabled and + * re-initialized again w/o runtime update (which isn't + * supported by kernel). Only need to remove the cpuhp here. + * The TDX host core code tracks TDX status and can handle + * 'multiple enabling' scenario. + */ + WARN_ON_ONCE(!tdx_cpuhp_state); + cpuhp_remove_state_nocalls(tdx_cpuhp_state); + tdx_cpuhp_state = 0; +} + +static int __init __do_tdx_bringup(void) +{ + int r; + + /* + * TDX-specific cpuhp callback to call tdx_cpu_enable() on all + * online CPUs before calling tdx_enable(), and on any new + * going-online CPU to make sure it is ready for TDX guest. + */ + r = cpuhp_setup_state_cpuslocked(CPUHP_AP_ONLINE_DYN, + "kvm/cpu/tdx:online", + tdx_online_cpu, NULL); + if (r < 0) + return r; + + tdx_cpuhp_state = r; + + r = tdx_enable(); + if (r) + __do_tdx_cleanup(); + + return r; +} + +static bool __init kvm_can_support_tdx(void) +{ + return cpu_feature_enabled(X86_FEATURE_TDX_HOST_PLATFORM); +} + +static int __init __tdx_bringup(void) +{ + int r; + + /* + * Enabling TDX requires enabling hardware virtualization first, + * as making SEAMCALLs requires CPU being in post-VMXON state. + */ + r = kvm_enable_virtualization(); + if (r) + return r; + + cpus_read_lock(); + r = __do_tdx_bringup(); + cpus_read_unlock(); + + if (r) + goto tdx_bringup_err; + + /* + * Leave hardware virtualization enabled after TDX is enabled + * successfully. TDX CPU hotplug depends on this. + */ + return 0; +tdx_bringup_err: + kvm_disable_virtualization(); + return r; +} + +void tdx_cleanup(void) +{ + if (enable_tdx) { + __do_tdx_cleanup(); + kvm_disable_virtualization(); + } +} + +int __init tdx_bringup(void) +{ + int r; + + if (!enable_tdx) + return 0; + + if (!kvm_can_support_tdx()) { + pr_err("tdx: no TDX private KeyIDs available\n"); + goto success_disable_tdx; + } + + if (!enable_virt_at_load) { + pr_err("tdx: tdx requires kvm.enable_virt_at_load=1\n"); + goto success_disable_tdx; + } + + /* + * Ideally KVM should probe whether TDX module has been loaded + * first and then try to bring it up. But TDX needs to use SEAMCALL + * to probe whether the module is loaded (there is no CPUID or MSR + * for that), and making SEAMCALL requires enabling virtualization + * first, just like the rest steps of bringing up TDX module. + * + * So, for simplicity do everything in __tdx_bringup(); the first + * SEAMCALL will return -ENODEV when the module is not loaded. The + * only complication is having to make sure that initialization + * SEAMCALLs don't return TDX_SEAMCALL_VMFAILINVALID in other + * cases. + */ + r = __tdx_bringup(); + if (r) { + /* + * Disable TDX only but don't fail to load module if + * the TDX module could not be loaded. No need to print + * message saying "module is not loaded" because it was + * printed when the first SEAMCALL failed. + */ + if (r == -ENODEV) + goto success_disable_tdx; + + enable_tdx = 0; + } + + return r; + +success_disable_tdx: + enable_tdx = 0; + return 0; +} diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h new file mode 100644 index 000000000000..8aee938a968f --- /dev/null +++ b/arch/x86/kvm/vmx/tdx.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __KVM_X86_VMX_TDX_H +#define __KVM_X86_VMX_TDX_H + +#ifdef CONFIG_INTEL_TDX_HOST +int tdx_bringup(void); +void tdx_cleanup(void); +#else +static inline int tdx_bringup(void) { return 0; } +static inline void tdx_cleanup(void) {} +#endif + +#endif diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 0122051af6b3..9f0c482c1a03 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -1462,6 +1462,20 @@ void __init tdx_init(void) check_tdx_erratum(); } +const struct tdx_sys_info *tdx_get_sysinfo(void) +{ + const struct tdx_sys_info *p = NULL; + + /* Make sure all fields in @tdx_sysinfo have been populated */ + mutex_lock(&tdx_module_lock); + if (tdx_module_status == TDX_MODULE_INITIALIZED) + p = (const struct tdx_sys_info *)&tdx_sysinfo; + mutex_unlock(&tdx_module_lock); + + return p; +} +EXPORT_SYMBOL_GPL(tdx_get_sysinfo); + int tdx_guest_keyid_alloc(void) { return ida_alloc_range(&tdx_guest_keyid_pool, tdx_guest_keyid_start, diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h index 62cb7832c42d..da384387d4eb 100644 --- a/arch/x86/virt/vmx/tdx/tdx.h +++ b/arch/x86/virt/vmx/tdx/tdx.h @@ -3,7 +3,6 @@ #define _X86_VIRT_TDX_H #include -#include "tdx_global_metadata.h" /* * This file contains both macros and data structures defined by the TDX diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 1e75fa114f34..3bfe3140f444 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2284,6 +2284,7 @@ static inline bool kvm_check_request(int req, struct kvm_vcpu *vcpu) } #ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING +extern bool enable_virt_at_load; extern bool kvm_rebooting; #endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 6e40383fbe47..622b5a99078a 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5464,8 +5464,9 @@ static struct miscdevice kvm_dev = { }; #ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING -static bool enable_virt_at_load = true; +bool enable_virt_at_load = true; module_param(enable_virt_at_load, bool, 0444); +EXPORT_SYMBOL_GPL(enable_virt_at_load); __visible bool kvm_rebooting; EXPORT_SYMBOL_GPL(kvm_rebooting); From patchwork Thu Feb 20 17:05:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984307 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F3E2E215193 for ; Thu, 20 Feb 2025 17:06:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071192; cv=none; b=HnM15ZWw/uPYSpTo6uyJLwKmzwDfM1NvGzFiiHEHo/G55So6YVIiL6sBAz4KAxxb2Od6nI0daaDZrLEfp0As4snHPl6UN78czAVNnH5LjD0X7IW3xfCGdGOfBqyBfV50LHxx331tsGfZpz2G455Q4u1l4k505TfeIlfSB1Oj70A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071192; c=relaxed/simple; bh=pGjkwu74Qac4zZwnPOH9vIK6+RLhm4IV0FQFk3WoURc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=o7Ubvx72hJcoQzNPPXou1gzhZlVk39So7e8dJJoYuse5EZhIE5sae5b6cGJGKhKo//wN/3+dxk53O27UD1FSlP0pkORsT9vBjS4oIo3yXpkZUfFXb25S7bENFW1jdADR9VMUDhVqk+8oHQsIDDxWuIFr/bTAnPTm5Ur+RgaBAQs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=AD5Jq7vJ; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="AD5Jq7vJ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071190; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8f7EzK9RxrNnHIDCrt2FIGl1J5d6f6j5FQ75ZzYv2x8=; b=AD5Jq7vJKRi0NF0qNiU0s4FstIKc3K6/RerNrCubm8EiZjWVHnnm+sa60IYg14l02MyUdK 0lqvBGWMGbeL994U0keAhfDstfY7/k4/i4WmvOpqid88Y95LvmvBJSsxOugTHOPp319KuF i2XU4v65PbQyrO55ZbK5Bs4plSCH/PA= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-634-23NEXoZHOlSZ-7fkiuoUGA-1; Thu, 20 Feb 2025 12:06:27 -0500 X-MC-Unique: 23NEXoZHOlSZ-7fkiuoUGA-1 X-Mimecast-MFC-AGG-ID: 23NEXoZHOlSZ-7fkiuoUGA_1740071186 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 979E81800992; Thu, 20 Feb 2025 17:06:26 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id A3D3A1800359; Thu, 20 Feb 2025 17:06:25 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Kai Huang Subject: [PATCH 13/30] KVM: TDX: Get TDX global information Date: Thu, 20 Feb 2025 12:05:47 -0500 Message-ID: <20250220170604.2279312-14-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Kai Huang KVM will need to consult some essential TDX global information to create and run TDX guests. Get the global information after initializing TDX. Signed-off-by: Kai Huang Signed-off-by: Rick Edgecombe Message-ID: <20241030190039.77971-3-rick.p.edgecombe@intel.com> Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/tdx.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 0666dfbe0bc0..761d3a9cd5c5 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -13,6 +13,8 @@ module_param_named(tdx, enable_tdx, bool, 0444); static enum cpuhp_state tdx_cpuhp_state; +static const struct tdx_sys_info *tdx_sysinfo; + static int tdx_online_cpu(unsigned int cpu) { unsigned long flags; @@ -90,11 +92,20 @@ static int __init __tdx_bringup(void) if (r) goto tdx_bringup_err; + /* Get TDX global information for later use */ + tdx_sysinfo = tdx_get_sysinfo(); + if (WARN_ON_ONCE(!tdx_sysinfo)) { + r = -EINVAL; + goto get_sysinfo_err; + } + /* * Leave hardware virtualization enabled after TDX is enabled * successfully. TDX CPU hotplug depends on this. */ return 0; +get_sysinfo_err: + __do_tdx_cleanup(); tdx_bringup_err: kvm_disable_virtualization(); return r; From patchwork Thu Feb 20 17:05:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984309 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A33152153CD for ; Thu, 20 Feb 2025 17:06:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071193; cv=none; b=tDTaeZSSlcFBYwLUzb0YrWM3Tab+XTQ3E1kXA+W7NAdtr9MtlkTsl1AXDBjE5iUEdeLlddmTXWmrfBjJknzL4yPuKC9Xn38KUFDFc3c8XA+V7/N5w6ihdOFmh53Odnjf1LFQUDUA86A71JAb2hzMscmO2LuFsBcZ5AGyfM4MOBA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071193; c=relaxed/simple; bh=39q2NNpf7SneAFLKlSbxcFeKHe+A1MhvK9SOsd/CvlU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ufr/lZLpT7KWWmhWH+8TB9qQccY48ObSDRzsddO64u8maeThUVtpAAKPQgb7laZfBMklPTjz2o4b7Df5PupetyvadfUFdk4OLZa6IwczrqJ2HuY090OQkEWUD7v0MI0pYDkLzNy6mnk8kNm3plWGHTD4l3Pa4qbYIfllu3mOwg8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=BdZiZrxS; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="BdZiZrxS" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071190; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fwyfFAMMWaAGgPHYwrleuHn9hlLr32V8u+eWlGmJlJo=; b=BdZiZrxSsr8tpehVN3eLEU4TamRmWrTItqNRkbilTsDXElfEV7QcjDiSknizOyWThRtvNg FTsKNln87nvzcKUhKdRcH/Ft0u+7aPKdYvs42n1F7oVoVvwdQbqg2mSKC+MHaJUIVmptRe l6ERXPibVGdmCUAbIWmrlomsIYUZ9+M= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-66-tBvkLz3DOAaRMmToqhm9HQ-1; Thu, 20 Feb 2025 12:06:29 -0500 X-MC-Unique: tBvkLz3DOAaRMmToqhm9HQ-1 X-Mimecast-MFC-AGG-ID: tBvkLz3DOAaRMmToqhm9HQ_1740071187 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id C5B9A1903080; Thu, 20 Feb 2025 17:06:27 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id CDF8A1800D9D; Thu, 20 Feb 2025 17:06:26 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata Subject: [PATCH 14/30] KVM: TDX: Add placeholders for TDX VM/vCPU structures Date: Thu, 20 Feb 2025 12:05:48 -0500 Message-ID: <20250220170604.2279312-15-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Isaku Yamahata Add TDX's own VM and vCPU structures as placeholder to manage and run TDX guests. Also add helper functions to check whether a VM/vCPU is TDX or normal VMX one, and add helpers to convert between TDX VM/vCPU and KVM VM/vCPU. TDX protects guest VMs from malicious host. Unlike VMX guests, TDX guests are crypto-protected. KVM cannot access TDX guests' memory and vCPU states directly. Instead, TDX requires KVM to use a set of TDX architecture-defined firmware APIs (a.k.a TDX module SEAMCALLs) to manage and run TDX guests. In fact, the way to manage and run TDX guests and normal VMX guests are quite different. Because of that, the current structures ('struct kvm_vmx' and 'struct vcpu_vmx') to manage VMX guests are not quite suitable for TDX guests. E.g., the majority of the members of 'struct vcpu_vmx' don't apply to TDX guests. Introduce TDX's own VM and vCPU structures ('struct kvm_tdx' and 'struct vcpu_tdx' respectively) for KVM to manage and run TDX guests. And instead of building TDX's VM and vCPU structures based on VMX's, build them directly based on 'struct kvm'. As a result, TDX and VMX guests will have different VM size and vCPU size/alignment. Currently, kvm_arch_alloc_vm() uses 'kvm_x86_ops::vm_size' to allocate enough space for the VM structure when creating guest. With TDX guests, ideally, KVM should allocate the VM structure based on the VM type so that the precise size can be allocated for VMX and TDX guests. But this requires more extensive code change. For now, simply choose the maximum size of 'struct kvm_tdx' and 'struct kvm_vmx' for VM structure allocation for both VMX and TDX guests. This would result in small memory waste for each VM which has smaller VM structure size but this is acceptable. For simplicity, use the same way for vCPU allocation too. Otherwise KVM would need to maintain a separate 'kvm_vcpu_cache' for each VM type. Note, updating the 'vt_x86_ops::vm_size' needs to be done before calling kvm_ops_update(), which copies vt_x86_ops to kvm_x86_ops. However this happens before TDX module initialization. Therefore theoretically it is possible that 'kvm_x86_ops::vm_size' is set to size of 'struct kvm_tdx' (when it's larger) but TDX actually fails to initialize at a later time. Again the worst case of this is wasting couple of bytes memory for each VM. KVM could choose to update 'kvm_x86_ops::vm_size' at a later time depending on TDX's status but that would require base KVM module to export either kvm_x86_ops or kvm_ops_update(). Signed-off-by: Isaku Yamahata Signed-off-by: Rick Edgecombe Signed-off-by: Paolo Bonzini --- - Make to_kvm_tdx() and to_tdx() private to tdx.c (Francesco, Tony) - Add pragma poison for to_vmx() (Paolo) Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/main.c | 53 ++++++++++++++++++++++++++++++++++++++--- arch/x86/kvm/vmx/tdx.c | 14 ++++++++++- arch/x86/kvm/vmx/tdx.h | 37 ++++++++++++++++++++++++++++ 3 files changed, 100 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 97c453187cc1..5c1acc88feef 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -8,6 +8,39 @@ #include "posted_intr.h" #include "tdx.h" +static __init int vt_hardware_setup(void) +{ + int ret; + + ret = vmx_hardware_setup(); + if (ret) + return ret; + + /* + * Update vt_x86_ops::vm_size here so it is ready before + * kvm_ops_update() is called in kvm_x86_vendor_init(). + * + * Note, the actual bringing up of TDX must be done after + * kvm_ops_update() because enabling TDX requires enabling + * hardware virtualization first, i.e., all online CPUs must + * be in post-VMXON state. This means the @vm_size here + * may be updated to TDX's size but TDX may fail to enable + * at later time. + * + * The VMX/VT code could update kvm_x86_ops::vm_size again + * after bringing up TDX, but this would require exporting + * either kvm_x86_ops or kvm_ops_update() from the base KVM + * module, which looks overkill. Anyway, the worst case here + * is KVM may allocate couple of more bytes than needed for + * each VM. + */ + if (enable_tdx) + vt_x86_ops.vm_size = max_t(unsigned int, vt_x86_ops.vm_size, + sizeof(struct kvm_tdx)); + + return 0; +} + #define VMX_REQUIRED_APICV_INHIBITS \ (BIT(APICV_INHIBIT_REASON_DISABLED) | \ BIT(APICV_INHIBIT_REASON_ABSENT) | \ @@ -163,7 +196,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { }; struct kvm_x86_init_ops vt_init_ops __initdata = { - .hardware_setup = vmx_hardware_setup, + .hardware_setup = vt_hardware_setup, .handle_intel_pt_intr = NULL, .runtime_ops = &vt_x86_ops, @@ -180,6 +213,7 @@ module_exit(vt_exit); static int __init vt_init(void) { + unsigned vcpu_size, vcpu_align; int r; r = vmx_init(); @@ -191,12 +225,25 @@ static int __init vt_init(void) if (r) goto err_tdx_bringup; + /* + * TDX and VMX have different vCPU structures. Calculate the + * maximum size/align so that kvm_init() can use the larger + * values to create the kmem_vcpu_cache. + */ + vcpu_size = sizeof(struct vcpu_vmx); + vcpu_align = __alignof__(struct vcpu_vmx); + if (enable_tdx) { + vcpu_size = max_t(unsigned, vcpu_size, + sizeof(struct vcpu_tdx)); + vcpu_align = max_t(unsigned, vcpu_align, + __alignof__(struct vcpu_tdx)); + } + /* * Common KVM initialization _must_ come last, after this, /dev/kvm is * exposed to userspace! */ - r = kvm_init(sizeof(struct vcpu_vmx), __alignof__(struct vcpu_vmx), - THIS_MODULE); + r = kvm_init(vcpu_size, vcpu_align, THIS_MODULE); if (r) goto err_kvm_init; diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 761d3a9cd5c5..a830e9c42bf2 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -5,16 +5,28 @@ #include "capabilities.h" #include "tdx.h" +#pragma GCC poison to_vmx + #undef pr_fmt #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt -static bool enable_tdx __ro_after_init; +bool enable_tdx __ro_after_init; module_param_named(tdx, enable_tdx, bool, 0444); static enum cpuhp_state tdx_cpuhp_state; static const struct tdx_sys_info *tdx_sysinfo; +static __always_inline struct kvm_tdx *to_kvm_tdx(struct kvm *kvm) +{ + return container_of(kvm, struct kvm_tdx, kvm); +} + +static __always_inline struct vcpu_tdx *to_tdx(struct kvm_vcpu *vcpu) +{ + return container_of(vcpu, struct vcpu_tdx, vcpu); +} + static int tdx_online_cpu(unsigned int cpu) { unsigned long flags; diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index 8aee938a968f..fc013c8816f1 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -5,9 +5,46 @@ #ifdef CONFIG_INTEL_TDX_HOST int tdx_bringup(void); void tdx_cleanup(void); + +extern bool enable_tdx; + +struct kvm_tdx { + struct kvm kvm; + /* TDX specific members follow. */ +}; + +struct vcpu_tdx { + struct kvm_vcpu vcpu; + /* TDX specific members follow. */ +}; + +static inline bool is_td(struct kvm *kvm) +{ + return kvm->arch.vm_type == KVM_X86_TDX_VM; +} + +static inline bool is_td_vcpu(struct kvm_vcpu *vcpu) +{ + return is_td(vcpu->kvm); +} + #else static inline int tdx_bringup(void) { return 0; } static inline void tdx_cleanup(void) {} + +#define enable_tdx 0 + +struct kvm_tdx { + struct kvm kvm; +}; + +struct vcpu_tdx { + struct kvm_vcpu vcpu; +}; + +static inline bool is_td(struct kvm *kvm) { return false; } +static inline bool is_td_vcpu(struct kvm_vcpu *vcpu) { return false; } + #endif #endif From patchwork Thu Feb 20 17:05:49 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984312 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4489E217713 for ; Thu, 20 Feb 2025 17:06:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071198; cv=none; b=J/3ts76vVG3Sow4ipBEteW3RnDX4TfpBGRZ0zFzSIzglXLqAxNqEbF9fDQsLGhOfGdgV2kzsB5xXlxm/8suh+lfIWZM+s1uZI+KCwcMq4j2IF3l/5TyH+rMICA4CoLeTxMGKuGE27fDt9pgUyBowRJ6Rj3NulP2FAXLNT5xUOos= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071198; c=relaxed/simple; bh=eu50HJKfyrCTMD4ztoEkCZGNty0iOhs0JtUIdC5Ri1o=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=tiD/TROtrzMvOLxe3NklEcUxRgQ2OKbaum1opxR8xTsYHP6ONzBeO98sqgzKRnu/r90pc4phF//lkuAaFBJwj5uwPKlwutK1EkuzE+7tm2j45WznhQG04jI/OSwhsRENDDkjrC2gqL2YhiiOP9r0b2iT34v+WFsgf/ujkpqKp/Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=KWIfDiYQ; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="KWIfDiYQ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071195; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bVGdK/kkfSDfuv7grRfMiXUUF7ivbIYfwlJKDtAOPok=; b=KWIfDiYQNumotQeNhykQiK27No7quBgq2Er7N5c+clGnDsd1ZRW2SGzqmk5ZpvsTG3GJZA kIPXRJUcxrtMUS+qdJ7j9E9+y7hWAz0wEXxLyDsU6stfG6lBzdqUJWkfdIpxLv/HX0rUgi qMysveZkVEDHbwNt172nca7Xt1+P9vE= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-481-y8DucBQzNgy_W9jIFAwHYA-1; Thu, 20 Feb 2025 12:06:31 -0500 X-MC-Unique: y8DucBQzNgy_W9jIFAwHYA-1 X-Mimecast-MFC-AGG-ID: y8DucBQzNgy_W9jIFAwHYA_1740071189 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4FF121800878; Thu, 20 Feb 2025 17:06:29 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 052F2180035F; Thu, 20 Feb 2025 17:06:27 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Tony Lindgren , Sean Christopherson , Xiaoyao Li Subject: [PATCH 15/30] KVM: TDX: Define TDX architectural definitions Date: Thu, 20 Feb 2025 12:05:49 -0500 Message-ID: <20250220170604.2279312-16-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Isaku Yamahata Define architectural definitions for KVM to issue the TDX SEAMCALLs. Structures and values that are architecturally defined in the TDX module specifications the chapter of ABI Reference. Co-developed-by: Tony Lindgren Signed-off-by: Tony Lindgren Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Signed-off-by: Rick Edgecombe Reviewed-by: Paolo Bonzini Reviewed-by: Xiaoyao Li --- - Drop old duplicate defines, the x86 core exports what's needed (Kai) Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/tdx.h | 2 + arch/x86/kvm/vmx/tdx_arch.h | 123 ++++++++++++++++++++++++++++++++++++ 2 files changed, 125 insertions(+) create mode 100644 arch/x86/kvm/vmx/tdx_arch.h diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index fc013c8816f1..4dea6e89fc69 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -2,6 +2,8 @@ #ifndef __KVM_X86_VMX_TDX_H #define __KVM_X86_VMX_TDX_H +#include "tdx_arch.h" + #ifdef CONFIG_INTEL_TDX_HOST int tdx_bringup(void); void tdx_cleanup(void); diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h new file mode 100644 index 000000000000..fb7abe9fef8e --- /dev/null +++ b/arch/x86/kvm/vmx/tdx_arch.h @@ -0,0 +1,123 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* architectural constants/data definitions for TDX SEAMCALLs */ + +#ifndef __KVM_X86_TDX_ARCH_H +#define __KVM_X86_TDX_ARCH_H + +#include + +/* TDX control structure (TDR/TDCS/TDVPS) field access codes */ +#define TDX_NON_ARCH BIT_ULL(63) +#define TDX_CLASS_SHIFT 56 +#define TDX_FIELD_MASK GENMASK_ULL(31, 0) + +#define __BUILD_TDX_FIELD(non_arch, class, field) \ + (((non_arch) ? TDX_NON_ARCH : 0) | \ + ((u64)(class) << TDX_CLASS_SHIFT) | \ + ((u64)(field) & TDX_FIELD_MASK)) + +#define BUILD_TDX_FIELD(class, field) \ + __BUILD_TDX_FIELD(false, (class), (field)) + +#define BUILD_TDX_FIELD_NON_ARCH(class, field) \ + __BUILD_TDX_FIELD(true, (class), (field)) + + +/* Class code for TD */ +#define TD_CLASS_EXECUTION_CONTROLS 17ULL + +/* Class code for TDVPS */ +#define TDVPS_CLASS_VMCS 0ULL +#define TDVPS_CLASS_GUEST_GPR 16ULL +#define TDVPS_CLASS_OTHER_GUEST 17ULL +#define TDVPS_CLASS_MANAGEMENT 32ULL + +enum tdx_tdcs_execution_control { + TD_TDCS_EXEC_TSC_OFFSET = 10, +}; + +/* @field is any of enum tdx_tdcs_execution_control */ +#define TDCS_EXEC(field) BUILD_TDX_FIELD(TD_CLASS_EXECUTION_CONTROLS, (field)) + +/* @field is the VMCS field encoding */ +#define TDVPS_VMCS(field) BUILD_TDX_FIELD(TDVPS_CLASS_VMCS, (field)) + +/* @field is any of enum tdx_guest_other_state */ +#define TDVPS_STATE(field) BUILD_TDX_FIELD(TDVPS_CLASS_OTHER_GUEST, (field)) +#define TDVPS_STATE_NON_ARCH(field) BUILD_TDX_FIELD_NON_ARCH(TDVPS_CLASS_OTHER_GUEST, (field)) + +/* Management class fields */ +enum tdx_vcpu_guest_management { + TD_VCPU_PEND_NMI = 11, +}; + +/* @field is any of enum tdx_vcpu_guest_management */ +#define TDVPS_MANAGEMENT(field) BUILD_TDX_FIELD(TDVPS_CLASS_MANAGEMENT, (field)) + +#define TDX_EXTENDMR_CHUNKSIZE 256 + +struct tdx_cpuid_value { + u32 eax; + u32 ebx; + u32 ecx; + u32 edx; +} __packed; + +#define TDX_TD_ATTR_DEBUG BIT_ULL(0) +#define TDX_TD_ATTR_SEPT_VE_DISABLE BIT_ULL(28) +#define TDX_TD_ATTR_PKS BIT_ULL(30) +#define TDX_TD_ATTR_KL BIT_ULL(31) +#define TDX_TD_ATTR_PERFMON BIT_ULL(63) + +/* + * TD_PARAMS is provided as an input to TDH_MNG_INIT, the size of which is 1024B. + */ +struct td_params { + u64 attributes; + u64 xfam; + u16 max_vcpus; + u8 reserved0[6]; + + u64 eptp_controls; + u64 config_flags; + u16 tsc_frequency; + u8 reserved1[38]; + + u64 mrconfigid[6]; + u64 mrowner[6]; + u64 mrownerconfig[6]; + u64 reserved2[4]; + + union { + DECLARE_FLEX_ARRAY(struct tdx_cpuid_value, cpuid_values); + u8 reserved3[768]; + }; +} __packed __aligned(1024); + +/* + * Guest uses MAX_PA for GPAW when set. + * 0: GPA.SHARED bit is GPA[47] + * 1: GPA.SHARED bit is GPA[51] + */ +#define TDX_CONFIG_FLAGS_MAX_GPAW BIT_ULL(0) + +/* + * TDH.VP.ENTER, TDG.VP.VMCALL preserves RBP + * 0: RBP can be used for TDG.VP.VMCALL input. RBP is clobbered. + * 1: RBP can't be used for TDG.VP.VMCALL input. RBP is preserved. + */ +#define TDX_CONFIG_FLAGS_NO_RBP_MOD BIT_ULL(2) + + +/* + * TDX requires the frequency to be defined in units of 25MHz, which is the + * frequency of the core crystal clock on TDX-capable platforms, i.e. the TDX + * module can only program frequencies that are multiples of 25MHz. The + * frequency must be between 100mhz and 10ghz (inclusive). + */ +#define TDX_TSC_KHZ_TO_25MHZ(tsc_in_khz) ((tsc_in_khz) / (25 * 1000)) +#define TDX_TSC_25MHZ_TO_KHZ(tsc_in_25mhz) ((tsc_in_25mhz) * (25 * 1000)) +#define TDX_MIN_TSC_FREQUENCY_KHZ (100 * 1000) +#define TDX_MAX_TSC_FREQUENCY_KHZ (10 * 1000 * 1000) + +#endif /* __KVM_X86_TDX_ARCH_H */ From patchwork Thu Feb 20 17:05:50 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984313 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D46A6217F54 for ; Thu, 20 Feb 2025 17:06:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071199; cv=none; b=qZQnBQfUxNK2c+NtSST0B+uNt2o0YUdd0Flc+HvUjeJa+VIf/DVmZv+jxDgLKqDL9YqdGBkyTN+f4nDno5VI/KIFDxLYqn6HKB48EJLn1SB6Al+1W8eY5y3qdXbDez+HGIfR686a3X9YI977KSBZ3ZtMpSt1kLUEHq4wkDJ1uPo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071199; c=relaxed/simple; bh=Cvk0fLAe+ttKnZPVjmBaJXVNqyzrbCujk7VB45yy+tk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=QAArIildjoYdXITvwFuL4NgMs0sNDdPrNNRqVZ5URs3W7BWq/f2ye+8hl4f5SqrNLWX4yH8u1C3vA0VQZ+x88/awQTp+Lg8/p1NgmVdRxrSZJmUOp2MJGhEaB1qKWRVn9glWl8/5lXG0NGeLlVG2L/CgZTEV6l3kN1iN87k2sqs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=THcp4JvQ; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="THcp4JvQ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071196; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+Mkpvq25cFtWQqE53LKozHj77nsCFaTRFqVovN6R0Ro=; b=THcp4JvQr9aCmWWlulTZMFtp3af/upDgq8eT5yc424WF7w9FKg1f4SFWCE7vLjgVKirgQl 9oabY2zBmpOJA51zQhBUv56GVgAcrSygP5IuIY9b+SUL+ELu9XmI82OGQolxNf/S46nLd8 xwunvL1BldQHvgNQ7fKJ27Ow+j683K0= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-101-DlPU8oVjM86JldCoSThOTA-1; Thu, 20 Feb 2025 12:06:33 -0500 X-MC-Unique: DlPU8oVjM86JldCoSThOTA-1 X-Mimecast-MFC-AGG-ID: DlPU8oVjM86JldCoSThOTA_1740071190 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id CF0DF1801A15; Thu, 20 Feb 2025 17:06:30 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 84E1E180035F; Thu, 20 Feb 2025 17:06:29 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Sean Christopherson , Isaku Yamahata , Yuan Yao , Xiaoyao Li Subject: [PATCH 16/30] KVM: TDX: Add TDX "architectural" error codes Date: Thu, 20 Feb 2025 12:05:50 -0500 Message-ID: <20250220170604.2279312-17-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Sean Christopherson Add error codes for the TDX SEAMCALLs both for TDX VMM side for TDH SEAMCALL and TDX guest side for TDG.VP.VMCALL. KVM issues the TDX SEAMCALLs and checks its error code. KVM handles hypercall from the TDX guest and may return an error. So error code for the TDX guest is also needed. TDX SEAMCALL uses bits 31:0 to return more information, so these error codes will only exactly match RAX[63:32]. Error codes for TDG.VP.VMCALL is defined by TDX Guest-Host-Communication interface spec. Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Signed-off-by: Rick Edgecombe Reviewed-by: Paolo Bonzini Reviewed-by: Yuan Yao Reviewed-by: Xiaoyao Li Message-ID: <20241030190039.77971-14-rick.p.edgecombe@intel.com> Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/shared/tdx.h | 7 +++++- arch/x86/kvm/vmx/tdx.h | 1 + arch/x86/kvm/vmx/tdx_errno.h | 36 +++++++++++++++++++++++++++++++ 3 files changed, 43 insertions(+), 1 deletion(-) create mode 100644 arch/x86/kvm/vmx/tdx_errno.h diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h index fcbbef484a78..4aedab1f2a1a 100644 --- a/arch/x86/include/asm/shared/tdx.h +++ b/arch/x86/include/asm/shared/tdx.h @@ -71,7 +71,12 @@ #define TDVMCALL_GET_QUOTE 0x10002 #define TDVMCALL_REPORT_FATAL_ERROR 0x10003 -#define TDVMCALL_STATUS_RETRY 1 +/* + * TDG.VP.VMCALL Status Codes (returned in R10) + */ +#define TDVMCALL_STATUS_SUCCESS 0x0000000000000000ULL +#define TDVMCALL_STATUS_RETRY 0x0000000000000001ULL +#define TDVMCALL_STATUS_INVALID_OPERAND 0x8000000000000000ULL /* * Bitmasks of exposed registers (with VMM). diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index 4dea6e89fc69..abe77ceddb9f 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -3,6 +3,7 @@ #define __KVM_X86_VMX_TDX_H #include "tdx_arch.h" +#include "tdx_errno.h" #ifdef CONFIG_INTEL_TDX_HOST int tdx_bringup(void); diff --git a/arch/x86/kvm/vmx/tdx_errno.h b/arch/x86/kvm/vmx/tdx_errno.h new file mode 100644 index 000000000000..dc3fa2a58c2c --- /dev/null +++ b/arch/x86/kvm/vmx/tdx_errno.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* architectural status code for SEAMCALL */ + +#ifndef __KVM_X86_TDX_ERRNO_H +#define __KVM_X86_TDX_ERRNO_H + +#define TDX_SEAMCALL_STATUS_MASK 0xFFFFFFFF00000000ULL + +/* + * TDX SEAMCALL Status Codes (returned in RAX) + */ +#define TDX_NON_RECOVERABLE_VCPU 0x4000000100000000ULL +#define TDX_INTERRUPTED_RESUMABLE 0x8000000300000000ULL +#define TDX_OPERAND_INVALID 0xC000010000000000ULL +#define TDX_OPERAND_BUSY 0x8000020000000000ULL +#define TDX_PREVIOUS_TLB_EPOCH_BUSY 0x8000020100000000ULL +#define TDX_PAGE_METADATA_INCORRECT 0xC000030000000000ULL +#define TDX_VCPU_NOT_ASSOCIATED 0x8000070200000000ULL +#define TDX_KEY_GENERATION_FAILED 0x8000080000000000ULL +#define TDX_KEY_STATE_INCORRECT 0xC000081100000000ULL +#define TDX_KEY_CONFIGURED 0x0000081500000000ULL +#define TDX_NO_HKID_READY_TO_WBCACHE 0x0000082100000000ULL +#define TDX_FLUSHVP_NOT_DONE 0x8000082400000000ULL +#define TDX_EPT_WALK_FAILED 0xC0000B0000000000ULL +#define TDX_EPT_ENTRY_STATE_INCORRECT 0xC0000B0D00000000ULL + +/* + * TDX module operand ID, appears in 31:0 part of error code as + * detail information + */ +#define TDX_OPERAND_ID_RCX 0x01 +#define TDX_OPERAND_ID_TDR 0x80 +#define TDX_OPERAND_ID_SEPT 0x92 +#define TDX_OPERAND_ID_TD_EPOCH 0xa9 + +#endif /* __KVM_X86_TDX_ERRNO_H */ From patchwork Thu Feb 20 17:05:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984320 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B47042206A4 for ; Thu, 20 Feb 2025 17:06:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071210; cv=none; b=V3jfqxSX3kVk5PfIexTbBqSyyPy0/hLyb/MmaOFm+4i36HAYmCdbro92mfiyUZTKY3eJsEgtOuSZNaiY9x3XP720HddVc/ldC93qzyPwE/6e+AVWJ4PoMFaG2BRI7OylwmuqBndiPzf4MFqVRpdz5y9yeVkj/Ao5uqJ3E8Xmboo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071210; c=relaxed/simple; bh=1I701KyNSGUm8RYQFeoQ3GgMoqE3LdQNgtYUk6q86q8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=r/TKluvzgUjovucNll2X2lHXxTmj9u6Xoesor8bRblPr+yBkc+lWy6eraF1MFBchQ/Je1UElnQTitmeUHwhFp+UUb+FJkvEtOaU1/hCZogQ7DjEoEwvMWkQPCZaKtW2z9z0DvHSrUvJGUkGDYmGQZ/NE08JKAVKGT/TzZT58AYQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=WQFS7qDy; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="WQFS7qDy" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071207; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jxfc0QDdR7NUvWcoQzVsBBRdWvI7T7CZJvQ3sHjQves=; b=WQFS7qDy7lI35ZELZraMxx1x1WQuS+CRL9DSPXFzP/APHe0aRsckP8sM0U0ZebEpqOgmoO qAccGYtnrl46Ai9kR5NehZg53WkMx7N/7WLZvBL6T/322ZChfEF6IjgcIftvglzEEl8sLl ssuirxzg1JfPlyTlj+ZSxO0LeuJ3ZPM= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-647-8RdNyRGDNZuWfFeTJ7S94w-1; Thu, 20 Feb 2025 12:06:33 -0500 X-MC-Unique: 8RdNyRGDNZuWfFeTJ7S94w-1 X-Mimecast-MFC-AGG-ID: 8RdNyRGDNZuWfFeTJ7S94w_1740071192 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 60A6E19357B1; Thu, 20 Feb 2025 17:06:32 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 10B45180035F; Thu, 20 Feb 2025 17:06:30 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Tony Lindgren , Binbin Wu , Yuan Yao Subject: [PATCH 17/30] KVM: TDX: Add helper functions to print TDX SEAMCALL error Date: Thu, 20 Feb 2025 12:05:51 -0500 Message-ID: <20250220170604.2279312-18-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Isaku Yamahata Add helper functions to print out errors from the TDX module in a uniform manner. Signed-off-by: Isaku Yamahata Co-developed-by: Tony Lindgren Signed-off-by: Tony Lindgren Signed-off-by: Rick Edgecombe Reviewed-by: Binbin Wu Reviewed-by: Yuan Yao Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/tdx.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index a830e9c42bf2..1f4875e5f7c9 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -10,6 +10,21 @@ #undef pr_fmt #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt +#define pr_tdx_error(__fn, __err) \ + pr_err_ratelimited("SEAMCALL %s failed: 0x%llx\n", #__fn, __err) + +#define __pr_tdx_error_N(__fn_str, __err, __fmt, ...) \ + pr_err_ratelimited("SEAMCALL " __fn_str " failed: 0x%llx, " __fmt, __err, __VA_ARGS__) + +#define pr_tdx_error_1(__fn, __err, __rcx) \ + __pr_tdx_error_N(#__fn, __err, "rcx 0x%llx\n", __rcx) + +#define pr_tdx_error_2(__fn, __err, __rcx, __rdx) \ + __pr_tdx_error_N(#__fn, __err, "rcx 0x%llx, rdx 0x%llx\n", __rcx, __rdx) + +#define pr_tdx_error_3(__fn, __err, __rcx, __rdx, __r8) \ + __pr_tdx_error_N(#__fn, __err, "rcx 0x%llx, rdx 0x%llx, r8 0x%llx\n", __rcx, __rdx, __r8) + bool enable_tdx __ro_after_init; module_param_named(tdx, enable_tdx, bool, 0444); From patchwork Thu Feb 20 17:05:52 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984318 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8BFDC21D5A9 for ; Thu, 20 Feb 2025 17:06:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071208; cv=none; b=sDPno6IYRetdp/WTlXssaO5NxpYtijAF0MeCmRDHQjPjzeGIPMf8djatgGEsiBv82UexZ0A+tWL9LFMHO8M3mrhqcsmkymFh0vFraNl+lQoei8Jv6Xbe5ZqySQ6k8poQF2T6xNDlqBc9TiIemkNScFcPrKKnK0OROxfcchJTTHA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071208; c=relaxed/simple; bh=3bbbPtfHcGDQsrY8sxyTOgmbyvMh9I+RviSMH3t2ZAo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=MOHzAFHWTw4LOKmIdTYfjNtxzcE6R14xkoGNBvViSZEPE3ofGTOxuwEsM0mzZ+s5H8ZHZLfCkJ4O/f3kWTkE6DMjQ2btiixSGwmeR/mv6QEcA2BVOHO68VV5cZ6qSpZYR6zuDtP1BncceKkSHZNdyPyBwWsTGpipqXJVulV7o6E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=RXCtUbyd; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="RXCtUbyd" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071205; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AeZoBHdL9jkJ4w73bjz9IXMkss9N9nVKm83Iytboi+M=; b=RXCtUbydOt86+dV0U3rEg/9PcWS6OnOskBm4NY42unzEgWycClBcve8hno8HR9idzQipFS P28qI+rOqJFil4HZp4texF/yrJCfduz8D7cVGduKQn0pH5G5Pf8LL/GExtrwawoBlDYNWz 2fjbXx3172V1La4rqYyeILVV/osQR7U= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-48-zkq8OD_XPyy2kF7vSkgrAg-1; Thu, 20 Feb 2025 12:06:35 -0500 X-MC-Unique: zkq8OD_XPyy2kF7vSkgrAg-1 X-Mimecast-MFC-AGG-ID: zkq8OD_XPyy2kF7vSkgrAg_1740071193 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A6CC7180087A; Thu, 20 Feb 2025 17:06:33 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 942691800359; Thu, 20 Feb 2025 17:06:32 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Tony Lindgren Subject: [PATCH 18/30] KVM: TDX: Add place holder for TDX VM specific mem_enc_op ioctl Date: Thu, 20 Feb 2025 12:05:52 -0500 Message-ID: <20250220170604.2279312-19-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Isaku Yamahata KVM_MEMORY_ENCRYPT_OP was introduced for VM-scoped operations specific for guest state-protected VM. It defined subcommands for technology-specific operations under KVM_MEMORY_ENCRYPT_OP. Despite its name, the subcommands are not limited to memory encryption, but various technology-specific operations are defined. It's natural to repurpose KVM_MEMORY_ENCRYPT_OP for TDX specific operations and define subcommands. Add a place holder function for TDX specific VM-scoped ioctl as mem_enc_op. TDX specific sub-commands will be added to retrieve/pass TDX specific parameters. Make mem_enc_ioctl non-optional as it's always filled. Signed-off-by: Isaku Yamahata Co-developed-by: Tony Lindgren Signed-off-by: Tony Lindgren Signed-off-by: Rick Edgecombe --- - Drop the misleading "defined for consistency" line. It's a copy-paste error introduced in the earlier patches. Earlier there was padding at the end to match struct kvm_sev_cmd size. (Tony) Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm-x86-ops.h | 2 +- arch/x86/include/uapi/asm/kvm.h | 23 +++++++++++++++++++++ arch/x86/kvm/vmx/main.c | 10 ++++++++++ arch/x86/kvm/vmx/tdx.c | 32 ++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/x86_ops.h | 6 ++++++ arch/x86/kvm/x86.c | 4 ---- 6 files changed, 72 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 823c0434bbad..1eca04087cf4 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -125,7 +125,7 @@ KVM_X86_OP(leave_smm) KVM_X86_OP(enable_smi_window) #endif KVM_X86_OP_OPTIONAL(dev_get_attr) -KVM_X86_OP_OPTIONAL(mem_enc_ioctl) +KVM_X86_OP(mem_enc_ioctl) KVM_X86_OP_OPTIONAL(mem_enc_register_region) KVM_X86_OP_OPTIONAL(mem_enc_unregister_region) KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 9e75da97bce0..2b0317b47e47 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -927,4 +927,27 @@ struct kvm_hyperv_eventfd { #define KVM_X86_SNP_VM 4 #define KVM_X86_TDX_VM 5 +/* Trust Domain eXtension sub-ioctl() commands. */ +enum kvm_tdx_cmd_id { + KVM_TDX_CMD_NR_MAX, +}; + +struct kvm_tdx_cmd { + /* enum kvm_tdx_cmd_id */ + __u32 id; + /* flags for sub-commend. If sub-command doesn't use this, set zero. */ + __u32 flags; + /* + * data for each sub-command. An immediate or a pointer to the actual + * data in process virtual address. If sub-command doesn't use it, + * set zero. + */ + __u64 data; + /* + * Auxiliary error code. The sub-command may return TDX SEAMCALL + * status code in addition to -Exxx. + */ + __u64 hw_error; +}; + #endif /* _ASM_X86_KVM_H */ diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 5c1acc88feef..69c65085f81a 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -41,6 +41,14 @@ static __init int vt_hardware_setup(void) return 0; } +static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp) +{ + if (!is_td(kvm)) + return -ENOTTY; + + return tdx_vm_ioctl(kvm, argp); +} + #define VMX_REQUIRED_APICV_INHIBITS \ (BIT(APICV_INHIBIT_REASON_DISABLED) | \ BIT(APICV_INHIBIT_REASON_ABSENT) | \ @@ -193,6 +201,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector, .get_untagged_addr = vmx_get_untagged_addr, + + .mem_enc_ioctl = vt_mem_enc_ioctl, }; struct kvm_x86_init_ops vt_init_ops __initdata = { diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 1f4875e5f7c9..08880ac9c77b 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -3,6 +3,7 @@ #include #include #include "capabilities.h" +#include "x86_ops.h" #include "tdx.h" #pragma GCC poison to_vmx @@ -42,6 +43,37 @@ static __always_inline struct vcpu_tdx *to_tdx(struct kvm_vcpu *vcpu) return container_of(vcpu, struct vcpu_tdx, vcpu); } +int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) +{ + struct kvm_tdx_cmd tdx_cmd; + int r; + + if (copy_from_user(&tdx_cmd, argp, sizeof(struct kvm_tdx_cmd))) + return -EFAULT; + + /* + * Userspace should never set hw_error. It is used to fill + * hardware-defined error by the kernel. + */ + if (tdx_cmd.hw_error) + return -EINVAL; + + mutex_lock(&kvm->lock); + + switch (tdx_cmd.id) { + default: + r = -EINVAL; + goto out; + } + + if (copy_to_user(argp, &tdx_cmd, sizeof(struct kvm_tdx_cmd))) + r = -EFAULT; + +out: + mutex_unlock(&kvm->lock); + return r; +} + static int tdx_online_cpu(unsigned int cpu) { unsigned long flags; diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 430773a5ef8e..2c87a8e7bddc 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -121,4 +121,10 @@ void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu); #endif void vmx_setup_mce(struct kvm_vcpu *vcpu); +#ifdef CONFIG_INTEL_TDX_HOST +int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); +#else +static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } +#endif + #endif /* __KVM_X86_VMX_X86_OPS_H */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 02159c967d29..374d89e6663f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7277,10 +7277,6 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) goto out; } case KVM_MEMORY_ENCRYPT_OP: { - r = -ENOTTY; - if (!kvm_x86_ops.mem_enc_ioctl) - goto out; - r = kvm_x86_call(mem_enc_ioctl)(kvm, argp); break; } From patchwork Thu Feb 20 17:05:53 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984314 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 843E921A43D for ; Thu, 20 Feb 2025 17:06:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071203; cv=none; b=ZmdHtIDndja5EvygrTmpqxG41g4PJA0Puh1b/LOXME8tRouhmbZDbilTrERCitI+sooMUUmr0a0c6IyEjMfJdPtggrhlqK5fNAxz6AxD1TP4b8Cl36kIy81JKSBZTVFqpQUmzWz9qP4+oTo0CvZTuxB8Ieha5t3kU3PS3rRey2w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071203; c=relaxed/simple; bh=HbWCV+mETAHWTKHHBWZkcZUlSZ1oSOoV/q0RgCTLReg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=F+4Bx4QJwJbpIQiajZficZM6BMREANaf9aSF2KoChwNdTCNnn6SlvtLCvGM835KtkxMOe3WsOetVM+h0072XJC8VM8L2E2VuG/okpD7QKtIpJkKXaF/gsZDcELAKk4gDfLA+zg1R0Uq1CSuCcKNeFlktwDyjNk6b8rWf4O5tUeU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=EBO95XyS; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="EBO95XyS" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071200; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hq1OkvKb2vekpPdZQfh7i/EWk4fkjhdAozs6dxCFkao=; b=EBO95XyS2QuIXFI1kpmEG31KTkrXhvkA1h3q8spoXRqTK1LvHROIY/oK5iMjUzxKL1McPc sQHi9tUXZadug9+y/7F/tWUyNvfngVrtAMTKJzT7NXvvjWn46CJebbOQj8XBTLjzQme1Nj tVxoS1dvBg4NgEdlglxIqS3KvgZ+jH8= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-679-cjY7w19INPKFo5KBJuPaZQ-1; Thu, 20 Feb 2025 12:06:36 -0500 X-MC-Unique: cjY7w19INPKFo5KBJuPaZQ-1 X-Mimecast-MFC-AGG-ID: cjY7w19INPKFo5KBJuPaZQ_1740071195 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 3340B18EAB38; Thu, 20 Feb 2025 17:06:35 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id DB70B1800D9E; Thu, 20 Feb 2025 17:06:33 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Xiaoyao Li , Tony Lindgren , Binbin Wu Subject: [PATCH 19/30] KVM: TDX: Get system-wide info about TDX module on initialization Date: Thu, 20 Feb 2025 12:05:53 -0500 Message-ID: <20250220170604.2279312-20-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 From: Isaku Yamahata TDX KVM needs system-wide information about the TDX module. Generate the data based on tdx_sysinfo td_conf CPUID data. Signed-off-by: Isaku Yamahata Co-developed-by: Xiaoyao Li Signed-off-by: Xiaoyao Li Co-developed-by: Tony Lindgren Signed-off-by: Tony Lindgren Signed-off-by: Rick Edgecombe Reviewed-by: Binbin Wu --- - Clarify comment about EAX[23:16] in td_init_cpuid_entry2() (Xiaoyao) - Add comment for configurable CPUID bits (Xiaoyao) Signed-off-by: Paolo Bonzini --- arch/x86/include/uapi/asm/kvm.h | 11 +++ arch/x86/kvm/vmx/tdx.c | 137 ++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/tdx_arch.h | 2 + 3 files changed, 150 insertions(+) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 2b0317b47e47..8a4633cdb247 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -929,6 +929,8 @@ struct kvm_hyperv_eventfd { /* Trust Domain eXtension sub-ioctl() commands. */ enum kvm_tdx_cmd_id { + KVM_TDX_CAPABILITIES = 0, + KVM_TDX_CMD_NR_MAX, }; @@ -950,4 +952,13 @@ struct kvm_tdx_cmd { __u64 hw_error; }; +struct kvm_tdx_capabilities { + __u64 supported_attrs; + __u64 supported_xfam; + __u64 reserved[254]; + + /* Configurable CPUID bits for userspace */ + struct kvm_cpuid2 cpuid; +}; + #endif /* _ASM_X86_KVM_H */ diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 08880ac9c77b..630d19114429 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -33,6 +33,8 @@ static enum cpuhp_state tdx_cpuhp_state; static const struct tdx_sys_info *tdx_sysinfo; +#define KVM_SUPPORTED_TD_ATTRS (TDX_TD_ATTR_SEPT_VE_DISABLE) + static __always_inline struct kvm_tdx *to_kvm_tdx(struct kvm *kvm) { return container_of(kvm, struct kvm_tdx, kvm); @@ -43,6 +45,129 @@ static __always_inline struct vcpu_tdx *to_tdx(struct kvm_vcpu *vcpu) return container_of(vcpu, struct vcpu_tdx, vcpu); } +static u64 tdx_get_supported_attrs(const struct tdx_sys_info_td_conf *td_conf) +{ + u64 val = KVM_SUPPORTED_TD_ATTRS; + + if ((val & td_conf->attributes_fixed1) != td_conf->attributes_fixed1) + return 0; + + val &= td_conf->attributes_fixed0; + + return val; +} + +static u64 tdx_get_supported_xfam(const struct tdx_sys_info_td_conf *td_conf) +{ + u64 val = kvm_caps.supported_xcr0 | kvm_caps.supported_xss; + + if ((val & td_conf->xfam_fixed1) != td_conf->xfam_fixed1) + return 0; + + val &= td_conf->xfam_fixed0; + + return val; +} + +static u32 tdx_set_guest_phys_addr_bits(const u32 eax, int addr_bits) +{ + return (eax & ~GENMASK(23, 16)) | (addr_bits & 0xff) << 16; +} + +#define KVM_TDX_CPUID_NO_SUBLEAF ((__u32)-1) + +static void td_init_cpuid_entry2(struct kvm_cpuid_entry2 *entry, unsigned char idx) +{ + const struct tdx_sys_info_td_conf *td_conf = &tdx_sysinfo->td_conf; + + entry->function = (u32)td_conf->cpuid_config_leaves[idx]; + entry->index = td_conf->cpuid_config_leaves[idx] >> 32; + entry->eax = (u32)td_conf->cpuid_config_values[idx][0]; + entry->ebx = td_conf->cpuid_config_values[idx][0] >> 32; + entry->ecx = (u32)td_conf->cpuid_config_values[idx][1]; + entry->edx = td_conf->cpuid_config_values[idx][1] >> 32; + + if (entry->index == KVM_TDX_CPUID_NO_SUBLEAF) + entry->index = 0; + + /* + * The TDX module doesn't allow configuring the guest phys addr bits + * (EAX[23:16]). However, KVM uses it as an interface to the userspace + * to configure the GPAW. Report these bits as configurable. + */ + if (entry->function == 0x80000008) + entry->eax = tdx_set_guest_phys_addr_bits(entry->eax, 0xff); +} + +static int init_kvm_tdx_caps(const struct tdx_sys_info_td_conf *td_conf, + struct kvm_tdx_capabilities *caps) +{ + int i; + + caps->supported_attrs = tdx_get_supported_attrs(td_conf); + if (!caps->supported_attrs) + return -EIO; + + caps->supported_xfam = tdx_get_supported_xfam(td_conf); + if (!caps->supported_xfam) + return -EIO; + + caps->cpuid.nent = td_conf->num_cpuid_config; + + for (i = 0; i < td_conf->num_cpuid_config; i++) + td_init_cpuid_entry2(&caps->cpuid.entries[i], i); + + return 0; +} + +static int tdx_get_capabilities(struct kvm_tdx_cmd *cmd) +{ + const struct tdx_sys_info_td_conf *td_conf = &tdx_sysinfo->td_conf; + struct kvm_tdx_capabilities __user *user_caps; + struct kvm_tdx_capabilities *caps = NULL; + int ret = 0; + + /* flags is reserved for future use */ + if (cmd->flags) + return -EINVAL; + + caps = kmalloc(sizeof(*caps) + + sizeof(struct kvm_cpuid_entry2) * td_conf->num_cpuid_config, + GFP_KERNEL); + if (!caps) + return -ENOMEM; + + user_caps = u64_to_user_ptr(cmd->data); + if (copy_from_user(caps, user_caps, sizeof(*caps))) { + ret = -EFAULT; + goto out; + } + + if (caps->cpuid.nent < td_conf->num_cpuid_config) { + ret = -E2BIG; + goto out; + } + + ret = init_kvm_tdx_caps(td_conf, caps); + if (ret) + goto out; + + if (copy_to_user(user_caps, caps, sizeof(*caps))) { + ret = -EFAULT; + goto out; + } + + if (copy_to_user(user_caps->cpuid.entries, caps->cpuid.entries, + caps->cpuid.nent * + sizeof(caps->cpuid.entries[0]))) + ret = -EFAULT; + +out: + /* kfree() accepts NULL. */ + kfree(caps); + return ret; +} + int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { struct kvm_tdx_cmd tdx_cmd; @@ -61,6 +186,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) mutex_lock(&kvm->lock); switch (tdx_cmd.id) { + case KVM_TDX_CAPABILITIES: + r = tdx_get_capabilities(&tdx_cmd); + break; default: r = -EINVAL; goto out; @@ -158,11 +286,20 @@ static int __init __tdx_bringup(void) goto get_sysinfo_err; } + /* Check TDX module and KVM capabilities */ + if (!tdx_get_supported_attrs(&tdx_sysinfo->td_conf) || + !tdx_get_supported_xfam(&tdx_sysinfo->td_conf)) + goto get_sysinfo_err; + + if (!(tdx_sysinfo->features.tdx_features0 & MD_FIELD_ID_FEATURES0_TOPOLOGY_ENUM)) + goto get_sysinfo_err; + /* * Leave hardware virtualization enabled after TDX is enabled * successfully. TDX CPU hotplug depends on this. */ return 0; + get_sysinfo_err: __do_tdx_cleanup(); tdx_bringup_err: diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h index fb7abe9fef8e..cb9a638fa398 100644 --- a/arch/x86/kvm/vmx/tdx_arch.h +++ b/arch/x86/kvm/vmx/tdx_arch.h @@ -120,4 +120,6 @@ struct td_params { #define TDX_MIN_TSC_FREQUENCY_KHZ (100 * 1000) #define TDX_MAX_TSC_FREQUENCY_KHZ (10 * 1000 * 1000) +#define MD_FIELD_ID_FEATURES0_TOPOLOGY_ENUM BIT_ULL(20) + #endif /* __KVM_X86_TDX_ARCH_H */ From patchwork Thu Feb 20 17:05:54 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984317 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8CAB621CC7D for ; Thu, 20 Feb 2025 17:06:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071208; cv=none; b=EkaSUs7pdjEJGjwi9FyH5elWAdPr5ktk5sXMtDe36FTRE2QsGgU7olrcWA5iARYZXcyF04k1+dStd79j8848pp2umdzBLEVCPAWV8fcydlWCmmL9xnT1oR2Dg7Bm0awqWyuD8/7BabN57Fl/amFoIXQTm30QDRB3YOTE4mhexHw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071208; c=relaxed/simple; bh=VKFp8WlSLyYiESTl1y/92aIdbI/52iPf9W6R6cK3Tl4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=obK9+e2hHbydVRA5ZHReab/+gxMBiAAbfs0G5R43FDZ7suEjHk0a02NSkvKAWbSZtLUfqvwD8U9gHs0L9RlkW82kCEKwZ6I+N6gMBgVGHvw8e3f1tQih4uWfpef0QMK3IaR+r5DpXN5FXkxnl6RVBENyKZXrsTS0IcmS6WECRSE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=BmtdaBAw; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="BmtdaBAw" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071204; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ov2adn0i+9E8/WF3+1VTErS7R/vkRGsNMQFp+o/lZQc=; b=BmtdaBAwbNj2lZve3sgCVLCB5phULUX8upPA6cNpkDd8yzbr2Es9YEmz6uujY8UzJynU+0 tIFEqUEfxrWD+WV3KOUiSof7DQfPdw9FLEeOLNMj2994iVk5qzamJwl1rXegyPLbB2oE+e lZmcLQ0FTkE2DYKbeKfwPzczXFcYlP8= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-685-RG1BcHj1NlC4QTJKnp9AnQ-1; Thu, 20 Feb 2025 12:06:39 -0500 X-MC-Unique: RG1BcHj1NlC4QTJKnp9AnQ-1 X-Mimecast-MFC-AGG-ID: RG1BcHj1NlC4QTJKnp9AnQ_1740071197 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 5FCC31800873; Thu, 20 Feb 2025 17:06:37 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 082BD19412A3; Thu, 20 Feb 2025 17:06:35 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Tony Lindgren , Sean Christopherson , Kai Huang Subject: [PATCH 20/30] KVM: TDX: create/destroy VM structure Date: Thu, 20 Feb 2025 12:05:54 -0500 Message-ID: <20250220170604.2279312-21-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Isaku Yamahata Implement managing the TDX private KeyID to implement, create, destroy and free for a TDX guest. When creating at TDX guest, assign a TDX private KeyID for the TDX guest for memory encryption, and allocate pages for the guest. These are used for the Trust Domain Root (TDR) and Trust Domain Control Structure (TDCS). On destruction, free the allocated pages, and the KeyID. Before tearing down the private page tables, TDX requires the guest TD to be destroyed by reclaiming the KeyID. Do it at vm_destroy() kvm_x86_ops hook. Add a call for vm_free() at the end of kvm_arch_destroy_vm() because the per-VM TDR needs to be freed after the KeyID. Co-developed-by: Tony Lindgren Signed-off-by: Tony Lindgren Signed-off-by: Isaku Yamahata Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Co-developed-by: Kai Huang Signed-off-by: Kai Huang Co-developed-by: Yan Zhao Signed-off-by: Yan Zhao Co-developed-by: Rick Edgecombe Signed-off-by: Rick Edgecombe --- - Fix build issue in kvm-coco-queue - Init ret earlier to fix __tdx_td_init() error handling. (Chao) - Standardize -EAGAIN for __tdx_td_init() retry errors (Rick) Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/Kconfig | 2 + arch/x86/kvm/vmx/main.c | 28 +- arch/x86/kvm/vmx/tdx.c | 450 +++++++++++++++++++++++++++++ arch/x86/kvm/vmx/tdx.h | 4 +- arch/x86/kvm/vmx/x86_ops.h | 6 + arch/x86/kvm/x86.c | 1 + 8 files changed, 490 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 1eca04087cf4..bb909e6125b4 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -21,6 +21,7 @@ KVM_X86_OP(has_emulated_msr) KVM_X86_OP(vcpu_after_set_cpuid) KVM_X86_OP(vm_init) KVM_X86_OP_OPTIONAL(vm_destroy) +KVM_X86_OP_OPTIONAL(vm_free) KVM_X86_OP_OPTIONAL_RET0(vcpu_precreate) KVM_X86_OP(vcpu_create) KVM_X86_OP(vcpu_free) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 0b7af5902ff7..831b20830a09 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1663,6 +1663,7 @@ struct kvm_x86_ops { unsigned int vm_size; int (*vm_init)(struct kvm *kvm); void (*vm_destroy)(struct kvm *kvm); + void (*vm_free)(struct kvm *kvm); /* Create, but do not attach this VCPU */ int (*vcpu_precreate)(struct kvm *kvm); diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index fe8cbee6f614..0d445a317f61 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -94,6 +94,8 @@ config KVM_SW_PROTECTED_VM config KVM_INTEL tristate "KVM for Intel (and compatible) processors support" depends on KVM && IA32_FEAT_CTL + select KVM_GENERIC_PRIVATE_MEM if INTEL_TDX_HOST + select KVM_GENERIC_MEMORY_ATTRIBUTES if INTEL_TDX_HOST help Provides support for KVM on processors equipped with Intel's VT extensions, a.k.a. Virtual Machine Extensions (VMX). diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 69c65085f81a..b655a5051c13 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -41,6 +41,28 @@ static __init int vt_hardware_setup(void) return 0; } +static int vt_vm_init(struct kvm *kvm) +{ + if (is_td(kvm)) + return tdx_vm_init(kvm); + + return vmx_vm_init(kvm); +} + +static void vt_vm_destroy(struct kvm *kvm) +{ + if (is_td(kvm)) + return tdx_mmu_release_hkid(kvm); + + vmx_vm_destroy(kvm); +} + +static void vt_vm_free(struct kvm *kvm) +{ + if (is_td(kvm)) + tdx_vm_free(kvm); +} + static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp) { if (!is_td(kvm)) @@ -72,8 +94,10 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .has_emulated_msr = vmx_has_emulated_msr, .vm_size = sizeof(struct kvm_vmx), - .vm_init = vmx_vm_init, - .vm_destroy = vmx_vm_destroy, + + .vm_init = vt_vm_init, + .vm_destroy = vt_vm_destroy, + .vm_free = vt_vm_free, .vcpu_precreate = vmx_vcpu_precreate, .vcpu_create = vmx_vcpu_create, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 630d19114429..c270e2109d7d 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -120,6 +120,280 @@ static int init_kvm_tdx_caps(const struct tdx_sys_info_td_conf *td_conf, return 0; } +/* + * Some SEAMCALLs acquire the TDX module globally, and can fail with + * TDX_OPERAND_BUSY. Use a global mutex to serialize these SEAMCALLs. + */ +static DEFINE_MUTEX(tdx_lock); + +/* Maximum number of retries to attempt for SEAMCALLs. */ +#define TDX_SEAMCALL_RETRIES 10000 + +static inline void tdx_hkid_free(struct kvm_tdx *kvm_tdx) +{ + tdx_guest_keyid_free(kvm_tdx->hkid); + kvm_tdx->hkid = -1; +} + +static inline bool is_hkid_assigned(struct kvm_tdx *kvm_tdx) +{ + return kvm_tdx->hkid > 0; +} + +static void tdx_clear_page(struct page *page) +{ + const void *zero_page = (const void *) page_to_virt(ZERO_PAGE(0)); + void *dest = page_to_virt(page); + unsigned long i; + + /* + * The page could have been poisoned. MOVDIR64B also clears + * the poison bit so the kernel can safely use the page again. + */ + for (i = 0; i < PAGE_SIZE; i += 64) + movdir64b(dest + i, zero_page); + /* + * MOVDIR64B store uses WC buffer. Prevent following memory reads + * from seeing potentially poisoned cache. + */ + __mb(); +} + +/* TDH.PHYMEM.PAGE.RECLAIM is allowed only when destroying the TD. */ +static int __tdx_reclaim_page(struct page *page) +{ + u64 err, rcx, rdx, r8; + int i; + + for (i = TDX_SEAMCALL_RETRIES; i > 0; i--) { + err = tdh_phymem_page_reclaim(page, &rcx, &rdx, &r8); + + /* + * TDH.PHYMEM.PAGE.RECLAIM is allowed only when TD is shutdown. + * state. i.e. destructing TD. + * TDH.PHYMEM.PAGE.RECLAIM requires TDR and target page. + * Because we're destructing TD, it's rare to contend with TDR. + */ + switch (err) { + case TDX_OPERAND_BUSY | TDX_OPERAND_ID_RCX: + case TDX_OPERAND_BUSY | TDX_OPERAND_ID_TDR: + cond_resched(); + continue; + default: + goto out; + } + } + +out: + if (WARN_ON_ONCE(err)) { + pr_tdx_error_3(TDH_PHYMEM_PAGE_RECLAIM, err, rcx, rdx, r8); + return -EIO; + } + return 0; +} + +static int tdx_reclaim_page(struct page *page) +{ + int r; + + r = __tdx_reclaim_page(page); + if (!r) + tdx_clear_page(page); + return r; +} + + +/* + * Reclaim the TD control page(s) which are crypto-protected by TDX guest's + * private KeyID. Assume the cache associated with the TDX private KeyID has + * been flushed. + */ +static void tdx_reclaim_control_page(struct page *ctrl_page) +{ + /* + * Leak the page if the kernel failed to reclaim the page. + * The kernel cannot use it safely anymore. + */ + if (tdx_reclaim_page(ctrl_page)) + return; + + __free_page(ctrl_page); +} + +static void smp_func_do_phymem_cache_wb(void *unused) +{ + u64 err = 0; + bool resume; + int i; + + /* + * TDH.PHYMEM.CACHE.WB flushes caches associated with any TDX private + * KeyID on the package or core. The TDX module may not finish the + * cache flush but return TDX_INTERRUPTED_RESUMEABLE instead. The + * kernel should retry it until it returns success w/o rescheduling. + */ + for (i = TDX_SEAMCALL_RETRIES; i > 0; i--) { + resume = !!err; + err = tdh_phymem_cache_wb(resume); + switch (err) { + case TDX_INTERRUPTED_RESUMABLE: + continue; + case TDX_NO_HKID_READY_TO_WBCACHE: + err = TDX_SUCCESS; /* Already done by other thread */ + fallthrough; + default: + goto out; + } + } + +out: + if (WARN_ON_ONCE(err)) + pr_tdx_error(TDH_PHYMEM_CACHE_WB, err); +} + +void tdx_mmu_release_hkid(struct kvm *kvm) +{ + bool packages_allocated, targets_allocated; + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + cpumask_var_t packages, targets; + u64 err; + int i; + + if (!is_hkid_assigned(kvm_tdx)) + return; + + /* KeyID has been allocated but guest is not yet configured */ + if (!kvm_tdx->td.tdr_page) { + tdx_hkid_free(kvm_tdx); + return; + } + + packages_allocated = zalloc_cpumask_var(&packages, GFP_KERNEL); + targets_allocated = zalloc_cpumask_var(&targets, GFP_KERNEL); + cpus_read_lock(); + + /* + * TDH.PHYMEM.CACHE.WB tries to acquire the TDX module global lock + * and can fail with TDX_OPERAND_BUSY when it fails to get the lock. + * Multiple TDX guests can be destroyed simultaneously. Take the + * mutex to prevent it from getting error. + */ + mutex_lock(&tdx_lock); + + /* + * Releasing HKID is in vm_destroy(). + * After the above flushing vps, there should be no more vCPU + * associations, as all vCPU fds have been released at this stage. + */ + for_each_online_cpu(i) { + if (packages_allocated && + cpumask_test_and_set_cpu(topology_physical_package_id(i), + packages)) + continue; + if (targets_allocated) + cpumask_set_cpu(i, targets); + } + if (targets_allocated) + on_each_cpu_mask(targets, smp_func_do_phymem_cache_wb, NULL, true); + else + on_each_cpu(smp_func_do_phymem_cache_wb, NULL, true); + /* + * In the case of error in smp_func_do_phymem_cache_wb(), the following + * tdh_mng_key_freeid() will fail. + */ + err = tdh_mng_key_freeid(&kvm_tdx->td); + if (KVM_BUG_ON(err, kvm)) { + pr_tdx_error(TDH_MNG_KEY_FREEID, err); + pr_err("tdh_mng_key_freeid() failed. HKID %d is leaked.\n", + kvm_tdx->hkid); + } else { + tdx_hkid_free(kvm_tdx); + } + + mutex_unlock(&tdx_lock); + cpus_read_unlock(); + free_cpumask_var(targets); + free_cpumask_var(packages); +} + +static void tdx_reclaim_td_control_pages(struct kvm *kvm) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + u64 err; + int i; + + /* + * tdx_mmu_release_hkid() failed to reclaim HKID. Something went wrong + * heavily with TDX module. Give up freeing TD pages. As the function + * already warned, don't warn it again. + */ + if (is_hkid_assigned(kvm_tdx)) + return; + + if (kvm_tdx->td.tdcs_pages) { + for (i = 0; i < kvm_tdx->td.tdcs_nr_pages; i++) { + if (!kvm_tdx->td.tdcs_pages[i]) + continue; + + tdx_reclaim_control_page(kvm_tdx->td.tdcs_pages[i]); + } + kfree(kvm_tdx->td.tdcs_pages); + kvm_tdx->td.tdcs_pages = NULL; + } + + if (!kvm_tdx->td.tdr_page) + return; + + if (__tdx_reclaim_page(kvm_tdx->td.tdr_page)) + return; + + /* + * Use a SEAMCALL to ask the TDX module to flush the cache based on the + * KeyID. TDX module may access TDR while operating on TD (Especially + * when it is reclaiming TDCS). + */ + err = tdh_phymem_page_wbinvd_tdr(&kvm_tdx->td); + if (KVM_BUG_ON(err, kvm)) { + pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err); + return; + } + tdx_clear_page(kvm_tdx->td.tdr_page); + + __free_page(kvm_tdx->td.tdr_page); + kvm_tdx->td.tdr_page = NULL; +} + +void tdx_vm_free(struct kvm *kvm) +{ + tdx_reclaim_td_control_pages(kvm); +} + +static int tdx_do_tdh_mng_key_config(void *param) +{ + struct kvm_tdx *kvm_tdx = param; + u64 err; + + /* TDX_RND_NO_ENTROPY related retries are handled by sc_retry() */ + err = tdh_mng_key_config(&kvm_tdx->td); + + if (KVM_BUG_ON(err, &kvm_tdx->kvm)) { + pr_tdx_error(TDH_MNG_KEY_CONFIG, err); + return -EIO; + } + + return 0; +} + +static int __tdx_td_init(struct kvm *kvm); + +int tdx_vm_init(struct kvm *kvm) +{ + kvm->arch.has_private_mem = true; + + /* Place holder for TDX specific logic. */ + return __tdx_td_init(kvm); +} + static int tdx_get_capabilities(struct kvm_tdx_cmd *cmd) { const struct tdx_sys_info_td_conf *td_conf = &tdx_sysinfo->td_conf; @@ -168,6 +442,177 @@ static int tdx_get_capabilities(struct kvm_tdx_cmd *cmd) return ret; } +static int __tdx_td_init(struct kvm *kvm) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + cpumask_var_t packages; + struct page **tdcs_pages = NULL; + struct page *tdr_page; + int ret, i; + u64 err; + + ret = tdx_guest_keyid_alloc(); + if (ret < 0) + return ret; + kvm_tdx->hkid = ret; + + ret = -ENOMEM; + + tdr_page = alloc_page(GFP_KERNEL); + if (!tdr_page) + goto free_hkid; + + kvm_tdx->td.tdcs_nr_pages = tdx_sysinfo->td_ctrl.tdcs_base_size / PAGE_SIZE; + tdcs_pages = kcalloc(kvm_tdx->td.tdcs_nr_pages, sizeof(*kvm_tdx->td.tdcs_pages), + GFP_KERNEL | __GFP_ZERO); + if (!tdcs_pages) + goto free_tdr; + + for (i = 0; i < kvm_tdx->td.tdcs_nr_pages; i++) { + tdcs_pages[i] = alloc_page(GFP_KERNEL); + if (!tdcs_pages[i]) + goto free_tdcs; + } + + if (!zalloc_cpumask_var(&packages, GFP_KERNEL)) + goto free_tdcs; + + cpus_read_lock(); + + /* + * Need at least one CPU of the package to be online in order to + * program all packages for host key id. Check it. + */ + for_each_present_cpu(i) + cpumask_set_cpu(topology_physical_package_id(i), packages); + for_each_online_cpu(i) + cpumask_clear_cpu(topology_physical_package_id(i), packages); + if (!cpumask_empty(packages)) { + ret = -EIO; + /* + * Because it's hard for human operator to figure out the + * reason, warn it. + */ +#define MSG_ALLPKG "All packages need to have online CPU to create TD. Online CPU and retry.\n" + pr_warn_ratelimited(MSG_ALLPKG); + goto free_packages; + } + + /* + * TDH.MNG.CREATE tries to grab the global TDX module and fails + * with TDX_OPERAND_BUSY when it fails to grab. Take the global + * lock to prevent it from failure. + */ + mutex_lock(&tdx_lock); + kvm_tdx->td.tdr_page = tdr_page; + err = tdh_mng_create(&kvm_tdx->td, kvm_tdx->hkid); + mutex_unlock(&tdx_lock); + + if (err == TDX_RND_NO_ENTROPY) { + ret = -EAGAIN; + goto free_packages; + } + + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MNG_CREATE, err); + ret = -EIO; + goto free_packages; + } + + for_each_online_cpu(i) { + int pkg = topology_physical_package_id(i); + + if (cpumask_test_and_set_cpu(pkg, packages)) + continue; + + /* + * Program the memory controller in the package with an + * encryption key associated to a TDX private host key id + * assigned to this TDR. Concurrent operations on same memory + * controller results in TDX_OPERAND_BUSY. No locking needed + * beyond the cpus_read_lock() above as it serializes against + * hotplug and the first online CPU of the package is always + * used. We never have two CPUs in the same socket trying to + * program the key. + */ + ret = smp_call_on_cpu(i, tdx_do_tdh_mng_key_config, + kvm_tdx, true); + if (ret) + break; + } + cpus_read_unlock(); + free_cpumask_var(packages); + if (ret) { + i = 0; + goto teardown; + } + + kvm_tdx->td.tdcs_pages = tdcs_pages; + for (i = 0; i < kvm_tdx->td.tdcs_nr_pages; i++) { + err = tdh_mng_addcx(&kvm_tdx->td, tdcs_pages[i]); + if (err == TDX_RND_NO_ENTROPY) { + /* Here it's hard to allow userspace to retry. */ + ret = -EAGAIN; + goto teardown; + } + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MNG_ADDCX, err); + ret = -EIO; + goto teardown; + } + } + + /* + * Note, TDH_MNG_INIT cannot be invoked here. TDH_MNG_INIT requires a dedicated + * ioctl() to define the configure CPUID values for the TD. + */ + return 0; + + /* + * The sequence for freeing resources from a partially initialized TD + * varies based on where in the initialization flow failure occurred. + * Simply use the full teardown and destroy, which naturally play nice + * with partial initialization. + */ +teardown: + /* Only free pages not yet added, so start at 'i' */ + for (; i < kvm_tdx->td.tdcs_nr_pages; i++) { + if (tdcs_pages[i]) { + __free_page(tdcs_pages[i]); + tdcs_pages[i] = NULL; + } + } + if (!kvm_tdx->td.tdcs_pages) + kfree(tdcs_pages); + + tdx_mmu_release_hkid(kvm); + tdx_reclaim_td_control_pages(kvm); + + return ret; + +free_packages: + cpus_read_unlock(); + free_cpumask_var(packages); + +free_tdcs: + for (i = 0; i < kvm_tdx->td.tdcs_nr_pages; i++) { + if (tdcs_pages[i]) + __free_page(tdcs_pages[i]); + } + kfree(tdcs_pages); + kvm_tdx->td.tdcs_pages = NULL; + +free_tdr: + if (tdr_page) + __free_page(tdr_page); + kvm_tdx->td.tdr_page = 0; + +free_hkid: + tdx_hkid_free(kvm_tdx); + + return ret; +} + int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { struct kvm_tdx_cmd tdx_cmd; @@ -322,6 +767,11 @@ int __init tdx_bringup(void) if (!enable_tdx) return 0; + if (!cpu_feature_enabled(X86_FEATURE_MOVDIR64B)) { + pr_err("MOVDIR64B is reqiured for TDX\n"); + goto success_disable_tdx; + } + if (!kvm_can_support_tdx()) { pr_err("tdx: no TDX private KeyIDs available\n"); goto success_disable_tdx; diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index abe77ceddb9f..8102461f775d 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -13,7 +13,9 @@ extern bool enable_tdx; struct kvm_tdx { struct kvm kvm; - /* TDX specific members follow. */ + int hkid; + + struct tdx_td td; }; struct vcpu_tdx { diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 2c87a8e7bddc..917473c6bf3c 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -122,8 +122,14 @@ void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu); void vmx_setup_mce(struct kvm_vcpu *vcpu); #ifdef CONFIG_INTEL_TDX_HOST +int tdx_vm_init(struct kvm *kvm); +void tdx_mmu_release_hkid(struct kvm *kvm); +void tdx_vm_free(struct kvm *kvm); int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); #else +static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; } +static inline void tdx_mmu_release_hkid(struct kvm *kvm) {} +static inline void tdx_vm_free(struct kvm *kvm) {} static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } #endif diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 374d89e6663f..e0b9b845df58 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12884,6 +12884,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm) kvm_page_track_cleanup(kvm); kvm_xen_destroy_vm(kvm); kvm_hv_destroy_vm(kvm); + static_call_cond(kvm_x86_vm_free)(kvm); } static void memslot_rmap_free(struct kvm_memory_slot *slot) From patchwork Thu Feb 20 17:05:55 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984315 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6835621ADDE for ; Thu, 20 Feb 2025 17:06:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071204; cv=none; b=oQvYWj/C9q+gAoW+cBjlCe/7wVB0GlCL2EEMCaypei/uFEjvqn7a8qNyVR8TWdYkrfoguHPjk1SWpbF7rPWKeHCNmyOB+z7fC909Q4P4CxHJ3PfuEByhqI8XvAknpblQO49an0gw3UsKcL8W+Q02TG53F2cca/TpPggiQXys+0Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071204; c=relaxed/simple; bh=2Epm2gwPvyZu8b7MPaPN5NQLedSf+1+e/lOPLDginZg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=CeWfZBld7ZLEfuDQFUWvDD0RUXSQcr2IQ9LtHa4IIwuvSyMWlbbokMhIinKNuv5QQwyI/uaKFG5AoKrywvxReKBV1db6jQVoSIAwtWMMGeDrIxz9bFGQ78RXxgB5lIxibTkPh2I0W5Zo43nH2/+GyYieXGQRioeFQ8EbUOn5yHg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Ij55wMvP; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Ij55wMvP" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071201; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3H4yjBWm2WqFP+poiFsbJvKniDl6yxMv7y+gQJGXGlI=; b=Ij55wMvPAT0VhfhOxeOy2rFOCqKUD87fCkT8g9UHWdd3QaeFIKrkT0BOhbFVNTW91dRx0Y U1/RaG+XWJ3dz12SHJFtXlcY79deFLP3Ya2C2fBPGi4H8m0TVwgiXX6cFjcR67z7bXQOZl 5e7xBRoU2SqiEBRPnbBBbd9mmbEg1EM= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-450-Al_Nnx4xPoKJW3a0go5eRw-1; Thu, 20 Feb 2025 12:06:40 -0500 X-MC-Unique: Al_Nnx4xPoKJW3a0go5eRw-1 X-Mimecast-MFC-AGG-ID: Al_Nnx4xPoKJW3a0go5eRw_1740071198 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 8FF011800876; Thu, 20 Feb 2025 17:06:38 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 9565519412A3; Thu, 20 Feb 2025 17:06:37 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata Subject: [PATCH 21/30] KVM: TDX: Support per-VM KVM_CAP_MAX_VCPUS extension check Date: Thu, 20 Feb 2025 12:05:55 -0500 Message-ID: <20250220170604.2279312-22-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Isaku Yamahata Change to report the KVM_CAP_MAX_VCPUS extension from globally to per-VM to allow userspace to be able to query maximum vCPUs for TDX guest via checking the KVM_CAP_MAX_VCPU extension on per-VM basis. Today KVM x86 reports KVM_MAX_VCPUS as guest's maximum vCPUs for all guests globally, and userspace, i.e. Qemu, queries the KVM_MAX_VCPUS extension globally but not on per-VM basis. TDX has its own limit of maximum vCPUs it can support for all TDX guests in addition to KVM_MAX_VCPUS. TDX module reports this limit via the MAX_VCPU_PER_TD global metadata. Different modules may report different values. In practice, the reported value reflects the maximum logical CPUs that ALL the platforms that the module supports can possibly have. Note some old modules may also not support this metadata, in which case the limit is U16_MAX. The current way to always report KVM_MAX_VCPUS in the KVM_CAP_MAX_VCPUS extension is not enough for TDX. To accommodate TDX, change to report the KVM_CAP_MAX_VCPUS extension on per-VM basis. Specifically, override kvm->max_vcpus in tdx_vm_init() for TDX guest, and report kvm->max_vcpus in the KVM_CAP_MAX_VCPUS extension check. Change to report "the number of logical CPUs the platform has" as the maximum vCPUs for TDX guest. Simply forwarding the MAX_VCPU_PER_TD reported by the TDX module would result in an unpredictable ABI because the reported value to userspace would be depending on whims of TDX modules. This works in practice because of the MAX_VCPU_PER_TD reported by the TDX module will never be smaller than the one reported to userspace. But to make sure KVM never reports an unsupported value, sanity check the MAX_VCPU_PER_TD reported by TDX module is not smaller than the number of logical CPUs the platform has, otherwise refuse to use TDX. Note, when creating a TDX guest, TDX actually requires the "maximum vCPUs for _this_ TDX guest" as an input to initialize the TDX guest. But TDX guest's maximum vCPUs is not part of TDREPORT thus not part of attestation, thus there's no need to allow userspace to explicitly _configure_ the maximum vCPUs on per-VM basis. KVM will simply use kvm->max_vcpus as input when initializing the TDX guest. Signed-off-by: Isaku Yamahata Signed-off-by: Rick Edgecombe Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/main.c | 1 + arch/x86/kvm/vmx/tdx.c | 51 +++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 2 ++ 3 files changed, 54 insertions(+) diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index b655a5051c13..83f0d4d34a4a 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -7,6 +7,7 @@ #include "pmu.h" #include "posted_intr.h" #include "tdx.h" +#include "tdx_arch.h" static __init int vt_hardware_setup(void) { diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index c270e2109d7d..60c26600bf18 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -390,6 +390,19 @@ int tdx_vm_init(struct kvm *kvm) { kvm->arch.has_private_mem = true; + /* + * TDX has its own limit of maximum vCPUs it can support for all + * TDX guests in addition to KVM_MAX_VCPUS. TDX module reports + * such limit via the MAX_VCPU_PER_TD global metadata. In + * practice, it reflects the number of logical CPUs that ALL + * platforms that the TDX module supports can possibly have. + * + * Limit TDX guest's maximum vCPUs to the number of logical CPUs + * the platform has. Simply forwarding the MAX_VCPU_PER_TD to + * userspace would result in an unpredictable ABI. + */ + kvm->max_vcpus = min_t(int, kvm->max_vcpus, num_present_cpus()); + /* Place holder for TDX specific logic. */ return __tdx_td_init(kvm); } @@ -707,6 +720,7 @@ static bool __init kvm_can_support_tdx(void) static int __init __tdx_bringup(void) { + const struct tdx_sys_info_td_conf *td_conf; int r; /* @@ -739,6 +753,43 @@ static int __init __tdx_bringup(void) if (!(tdx_sysinfo->features.tdx_features0 & MD_FIELD_ID_FEATURES0_TOPOLOGY_ENUM)) goto get_sysinfo_err; + /* + * TDX has its own limit of maximum vCPUs it can support for all + * TDX guests in addition to KVM_MAX_VCPUS. Userspace needs to + * query TDX guest's maximum vCPUs by checking KVM_CAP_MAX_VCPU + * extension on per-VM basis. + * + * TDX module reports such limit via the MAX_VCPU_PER_TD global + * metadata. Different modules may report different values. + * Some old module may also not support this metadata (in which + * case this limit is U16_MAX). + * + * In practice, the reported value reflects the maximum logical + * CPUs that ALL the platforms that the module supports can + * possibly have. + * + * Simply forwarding the MAX_VCPU_PER_TD to userspace could + * result in an unpredictable ABI. KVM instead always advertise + * the number of logical CPUs the platform has as the maximum + * vCPUs for TDX guests. + * + * Make sure MAX_VCPU_PER_TD reported by TDX module is not + * smaller than the number of logical CPUs, otherwise KVM will + * report an unsupported value to userspace. + * + * Note, a platform with TDX enabled in the BIOS cannot support + * physical CPU hotplug, and TDX requires the BIOS has marked + * all logical CPUs in MADT table as enabled. Just use + * num_present_cpus() for the number of logical CPUs. + */ + td_conf = &tdx_sysinfo->td_conf; + if (td_conf->max_vcpus_per_td < num_present_cpus()) { + pr_err("Disable TDX: MAX_VCPU_PER_TD (%u) smaller than number of logical CPUs (%u).\n", + td_conf->max_vcpus_per_td, num_present_cpus()); + r = -EINVAL; + goto get_sysinfo_err; + } + /* * Leave hardware virtualization enabled after TDX is enabled * successfully. TDX CPU hotplug depends on this. diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e0b9b845df58..007195d9ab0b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4708,6 +4708,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) break; case KVM_CAP_MAX_VCPUS: r = KVM_MAX_VCPUS; + if (kvm) + r = kvm->max_vcpus; break; case KVM_CAP_MAX_VCPU_ID: r = KVM_MAX_VCPU_IDS; From patchwork Thu Feb 20 17:05:56 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984316 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2D45D21CA19 for ; Thu, 20 Feb 2025 17:06:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071206; cv=none; b=PZMzrTVoQcxv6aWk4K6EDumJWWiybb7BX+uftYpuzImy7Kk3GofvcFrWQBjIEip2lUMQe7fqTcddPbqL70O1Yv2oUbPbm9KwH4o2Ze+29OW6g/5mJZv03jAjMPxUzgrxLMTBjLNFVONsOz+TPg2TUc7hG8GRbZMQWkHsR1i0cu8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071206; c=relaxed/simple; bh=JTIueOdIZjLRIp8mvEGEeO1r+hV/rIoc0ekRdL3GToY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=A2wPJ5gGU9AQH81iIbX4MSBpNaiZ0M3lzB+zJNWVcMVY9f5mK3s5GZtzUBqpn1hj55DWJ3CEraC0HeLB/BGXq51VPE+3NND5Mn015S0pEix/StH1M/81fP4Je52SQn2Ib8L60bvgNx9/J2VBt2q1DgoZLBpQH8vfOG1au9XXwU0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=UjssQ1C3; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UjssQ1C3" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071204; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4s21mY2ewISavR7XKbsSV9LgFnztnGJJFOSKeFooxvc=; b=UjssQ1C36EQFwQ6rDJijAPaYrcDkXSmBxwsM/R4Z736Ui8emT0/qzhdHLiLgW5vDI0b8F8 roTb3IXvcKcukVBTBDbq4u4a+6/F5z613BcArPCzmzfLCW3ZcrGd/CDce57C/frlxVNkBu xQPoUskzAmGso6j1cKct7c0NSiDi2/g= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-577-6ArPZJClOsubQxR5qp_dfA-1; Thu, 20 Feb 2025 12:06:41 -0500 X-MC-Unique: 6ArPZJClOsubQxR5qp_dfA-1 X-Mimecast-MFC-AGG-ID: 6ArPZJClOsubQxR5qp_dfA_1740071199 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A4631180087D; Thu, 20 Feb 2025 17:06:39 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C543E19412A4; Thu, 20 Feb 2025 17:06:38 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe Subject: [PATCH 22/30] KVM: x86: expose cpuid_entry2_find for TDX Date: Thu, 20 Feb 2025 12:05:56 -0500 Message-ID: <20250220170604.2279312-23-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 Signed-off-by: Paolo Bonzini --- arch/x86/kvm/cpuid.c | 19 +++++++++++-------- arch/x86/kvm/cpuid.h | 2 ++ 2 files changed, 13 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 8eb3a88707f2..936c5dd13bd5 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -86,13 +86,13 @@ u32 xstate_required_size(u64 xstate_bv, bool compacted) * Magic value used by KVM when querying userspace-provided CPUID entries and * doesn't care about the CPIUD index because the index of the function in * question is not significant. Note, this magic value must have at least one - * bit set in bits[63:32] and must be consumed as a u64 by cpuid_entry2_find() + * bit set in bits[63:32] and must be consumed as a u64 by kvm_find_cpuid_entry2() * to avoid false positives when processing guest CPUID input. */ #define KVM_CPUID_INDEX_NOT_SIGNIFICANT -1ull -static struct kvm_cpuid_entry2 *cpuid_entry2_find(struct kvm_vcpu *vcpu, - u32 function, u64 index) +struct kvm_cpuid_entry2 *kvm_find_cpuid_entry2( + struct kvm_cpuid_entry2 *entries, int nent, u32 function, u64 index) { struct kvm_cpuid_entry2 *e; int i; @@ -109,8 +109,8 @@ static struct kvm_cpuid_entry2 *cpuid_entry2_find(struct kvm_vcpu *vcpu, */ lockdep_assert_irqs_enabled(); - for (i = 0; i < vcpu->arch.cpuid_nent; i++) { - e = &vcpu->arch.cpuid_entries[i]; + for (i = 0; i < nent; i++) { + e = &entries[i]; if (e->function != function) continue; @@ -141,23 +141,26 @@ static struct kvm_cpuid_entry2 *cpuid_entry2_find(struct kvm_vcpu *vcpu, return NULL; } +EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry2); struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu, u32 function, u32 index) { - return cpuid_entry2_find(vcpu, function, index); + return kvm_find_cpuid_entry2(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent, + function, index); } EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry_index); struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu, u32 function) { - return cpuid_entry2_find(vcpu, function, KVM_CPUID_INDEX_NOT_SIGNIFICANT); + return kvm_find_cpuid_entry2(vcpu->arch.cpuid_entries, vcpu->arch.cpuid_nent, + function, KVM_CPUID_INDEX_NOT_SIGNIFICANT); } EXPORT_SYMBOL_GPL(kvm_find_cpuid_entry); /* - * cpuid_entry2_find() and KVM_CPUID_INDEX_NOT_SIGNIFICANT should never be used + * kvm_find_cpuid_entry2() and KVM_CPUID_INDEX_NOT_SIGNIFICANT should never be used * directly outside of kvm_find_cpuid_entry() and kvm_find_cpuid_entry_index(). */ #undef KVM_CPUID_INDEX_NOT_SIGNIFICANT diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index 67d80aa72d50..91bde94519f9 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -12,6 +12,8 @@ void kvm_set_cpu_caps(void); void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu); void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu); +struct kvm_cpuid_entry2 *kvm_find_cpuid_entry2(struct kvm_cpuid_entry2 *entries, + int nent, u32 function, u64 index); struct kvm_cpuid_entry2 *kvm_find_cpuid_entry_index(struct kvm_vcpu *vcpu, u32 function, u32 index); struct kvm_cpuid_entry2 *kvm_find_cpuid_entry(struct kvm_vcpu *vcpu, From patchwork Thu Feb 20 17:05:57 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984319 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 09C0921C175 for ; Thu, 20 Feb 2025 17:06:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071209; cv=none; b=UiR/GmmwLbHGNHsyRA5GBSJH17yOgUYIguzQRLMYu6AdwSuhBx9IZYklPNO/0WOux5hVB0YKCU5tmOPYUpCOdzW5vEDPPweHUp4k7uTlM7f589J/71Ap3bEc0KGpqyTA42zfLunBXJp6mSWGrMZGM5+eQKJEvY8mMbx8aWxO50w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071209; c=relaxed/simple; bh=UCUf0WyaxvAozDgNJlMmXYQQR4qcCYaO/4RPAwrQSfg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=fOZLtxytZfSiSViB+Oz/Dr1m6XyoPzAjhECndsCcRb9zbvksvI5F+b3zJObqQpbEUvOGRUjiAhMnDbrOr9kd8uAXbiXa339vw4/W+IIB6SdnUFyfDBA2xFxbJwrtgzV+RAVSvtRYURxsx+SD+Ec26RkMgGd8LPyVAkZCM/CINGU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=NAr5MP5d; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="NAr5MP5d" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071206; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+pkrWwiFaHZ0PGvjtFhAWrOFEruuB86cN9cT+7PSgjU=; b=NAr5MP5d1ZI2LuEkdTE4meBkCCBbTs/Ub8TEzuWl9/Stxvq6TQgCczT38PqQg6w6R/nU3i RrEmB64zGfsEHWIj3hX5+6BE0OMMA1xz5izfMlVcfe+1+7dN+s20jwM4pd0KdCW+ByDDZ6 mJ0ihjieOvUK+3kuaFoIHD5br2xjT0w= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-10-wSHAWirlNWehX0sLT43a6Q-1; Thu, 20 Feb 2025 12:06:42 -0500 X-MC-Unique: wSHAWirlNWehX0sLT43a6Q-1 X-Mimecast-MFC-AGG-ID: wSHAWirlNWehX0sLT43a6Q_1740071201 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id F2BCE190F9C9; Thu, 20 Feb 2025 17:06:40 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id DA5D119412A3; Thu, 20 Feb 2025 17:06:39 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Xiaoyao Li Subject: [PATCH 23/30] KVM: TDX: initialize VM with TDX specific parameters Date: Thu, 20 Feb 2025 12:05:57 -0500 Message-ID: <20250220170604.2279312-24-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Isaku Yamahata After the crypto-protection key has been configured, TDX requires a VM-scope initialization as a step of creating the TDX guest. This "per-VM" TDX initialization does the global configurations/features that the TDX guest can support, such as guest's CPUIDs (emulated by the TDX module), the maximum number of vcpus etc. This "per-VM" TDX initialization must be done before any "vcpu-scope" TDX initialization. To match this better, require the KVM_TDX_INIT_VM IOCTL() to be done before KVM creates any vcpus. Co-developed-by: Xiaoyao Li Signed-off-by: Xiaoyao Li Signed-off-by: Isaku Yamahata Signed-off-by: Rick Edgecombe Signed-off-by: Paolo Bonzini --- arch/x86/include/uapi/asm/kvm.h | 24 +++ arch/x86/kvm/vmx/tdx.c | 258 ++++++++++++++++++++++++++++++-- arch/x86/kvm/vmx/tdx.h | 25 ++++ 3 files changed, 297 insertions(+), 10 deletions(-) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 8a4633cdb247..b64351076f2a 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -930,6 +930,7 @@ struct kvm_hyperv_eventfd { /* Trust Domain eXtension sub-ioctl() commands. */ enum kvm_tdx_cmd_id { KVM_TDX_CAPABILITIES = 0, + KVM_TDX_INIT_VM, KVM_TDX_CMD_NR_MAX, }; @@ -961,4 +962,27 @@ struct kvm_tdx_capabilities { struct kvm_cpuid2 cpuid; }; +struct kvm_tdx_init_vm { + __u64 attributes; + __u64 xfam; + __u64 mrconfigid[6]; /* sha384 digest */ + __u64 mrowner[6]; /* sha384 digest */ + __u64 mrownerconfig[6]; /* sha384 digest */ + + /* The total space for TD_PARAMS before the CPUIDs is 256 bytes */ + __u64 reserved[12]; + + /* + * Call KVM_TDX_INIT_VM before vcpu creation, thus before + * KVM_SET_CPUID2. + * This configuration supersedes KVM_SET_CPUID2s for VCPUs because the + * TDX module directly virtualizes those CPUIDs without VMM. The user + * space VMM, e.g. qemu, should make KVM_SET_CPUID2 consistent with + * those values. If it doesn't, KVM may have wrong idea of vCPUIDs of + * the guest, and KVM may wrongly emulate CPUIDs or MSRs that the TDX + * module doesn't virtualize. + */ + struct kvm_cpuid2 cpuid; +}; + #endif /* _ASM_X86_KVM_H */ diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 60c26600bf18..ec8864453787 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -69,6 +69,11 @@ static u64 tdx_get_supported_xfam(const struct tdx_sys_info_td_conf *td_conf) return val; } +static int tdx_get_guest_phys_addr_bits(const u32 eax) +{ + return (eax & GENMASK(23, 16)) >> 16; +} + static u32 tdx_set_guest_phys_addr_bits(const u32 eax, int addr_bits) { return (eax & ~GENMASK(23, 16)) | (addr_bits & 0xff) << 16; @@ -365,7 +370,11 @@ static void tdx_reclaim_td_control_pages(struct kvm *kvm) void tdx_vm_free(struct kvm *kvm) { + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + tdx_reclaim_td_control_pages(kvm); + + kvm_tdx->state = TD_STATE_UNINITIALIZED; } static int tdx_do_tdh_mng_key_config(void *param) @@ -384,10 +393,10 @@ static int tdx_do_tdh_mng_key_config(void *param) return 0; } -static int __tdx_td_init(struct kvm *kvm); - int tdx_vm_init(struct kvm *kvm) { + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + kvm->arch.has_private_mem = true; /* @@ -403,8 +412,9 @@ int tdx_vm_init(struct kvm *kvm) */ kvm->max_vcpus = min_t(int, kvm->max_vcpus, num_present_cpus()); - /* Place holder for TDX specific logic. */ - return __tdx_td_init(kvm); + kvm_tdx->state = TD_STATE_UNINITIALIZED; + + return 0; } static int tdx_get_capabilities(struct kvm_tdx_cmd *cmd) @@ -455,15 +465,151 @@ static int tdx_get_capabilities(struct kvm_tdx_cmd *cmd) return ret; } -static int __tdx_td_init(struct kvm *kvm) +/* + * KVM reports guest physical address in CPUID.0x800000008.EAX[23:16], which is + * similar to TDX's GPAW. Use this field as the interface for userspace to + * configure the GPAW and EPT level for TDs. + * + * Only values 48 and 52 are supported. Value 52 means GPAW-52 and EPT level + * 5, Value 48 means GPAW-48 and EPT level 4. For value 48, GPAW-48 is always + * supported. Value 52 is only supported when the platform supports 5 level + * EPT. + */ +static int setup_tdparams_eptp_controls(struct kvm_cpuid2 *cpuid, + struct td_params *td_params) +{ + const struct kvm_cpuid_entry2 *entry; + int guest_pa; + + entry = kvm_find_cpuid_entry2(cpuid->entries, cpuid->nent, 0x80000008, 0); + if (!entry) + return -EINVAL; + + guest_pa = tdx_get_guest_phys_addr_bits(entry->eax); + + if (guest_pa != 48 && guest_pa != 52) + return -EINVAL; + + if (guest_pa == 52 && !cpu_has_vmx_ept_5levels()) + return -EINVAL; + + td_params->eptp_controls = VMX_EPTP_MT_WB; + if (guest_pa == 52) { + td_params->eptp_controls |= VMX_EPTP_PWL_5; + td_params->config_flags |= TDX_CONFIG_FLAGS_MAX_GPAW; + } else { + td_params->eptp_controls |= VMX_EPTP_PWL_4; + } + + return 0; +} + +static int setup_tdparams_cpuids(struct kvm_cpuid2 *cpuid, + struct td_params *td_params) +{ + const struct tdx_sys_info_td_conf *td_conf = &tdx_sysinfo->td_conf; + const struct kvm_cpuid_entry2 *entry; + struct tdx_cpuid_value *value; + int i, copy_cnt = 0; + + /* + * td_params.cpuid_values: The number and the order of cpuid_value must + * be same to the one of struct tdsysinfo.{num_cpuid_config, cpuid_configs} + * It's assumed that td_params was zeroed. + */ + for (i = 0; i < td_conf->num_cpuid_config; i++) { + struct kvm_cpuid_entry2 tmp; + + td_init_cpuid_entry2(&tmp, i); + + entry = kvm_find_cpuid_entry2(cpuid->entries, cpuid->nent, + tmp.function, tmp.index); + if (!entry) + continue; + + copy_cnt++; + + value = &td_params->cpuid_values[i]; + value->eax = entry->eax; + value->ebx = entry->ebx; + value->ecx = entry->ecx; + value->edx = entry->edx; + + /* + * TDX module does not accept nonzero bits 16..23 for the + * CPUID[0x80000008].EAX, see setup_tdparams_eptp_controls(). + */ + if (tmp.function == 0x80000008) + value->eax = tdx_set_guest_phys_addr_bits(value->eax, 0); + } + + /* + * Rely on the TDX module to reject invalid configuration, but it can't + * check of leafs that don't have a proper slot in td_params->cpuid_values + * to stick then. So fail if there were entries that didn't get copied to + * td_params. + */ + if (copy_cnt != cpuid->nent) + return -EINVAL; + + return 0; +} + +static int setup_tdparams(struct kvm *kvm, struct td_params *td_params, + struct kvm_tdx_init_vm *init_vm) +{ + const struct tdx_sys_info_td_conf *td_conf = &tdx_sysinfo->td_conf; + struct kvm_cpuid2 *cpuid = &init_vm->cpuid; + int ret; + + if (kvm->created_vcpus) + return -EBUSY; + + if (init_vm->attributes & ~tdx_get_supported_attrs(td_conf)) + return -EINVAL; + + if (init_vm->xfam & ~tdx_get_supported_xfam(td_conf)) + return -EINVAL; + + td_params->max_vcpus = kvm->max_vcpus; + td_params->attributes = init_vm->attributes | td_conf->attributes_fixed1; + td_params->xfam = init_vm->xfam | td_conf->xfam_fixed1; + + td_params->config_flags = TDX_CONFIG_FLAGS_NO_RBP_MOD; + td_params->tsc_frequency = TDX_TSC_KHZ_TO_25MHZ(kvm->arch.default_tsc_khz); + + ret = setup_tdparams_eptp_controls(cpuid, td_params); + if (ret) + return ret; + + ret = setup_tdparams_cpuids(cpuid, td_params); + if (ret) + return ret; + +#define MEMCPY_SAME_SIZE(dst, src) \ + do { \ + BUILD_BUG_ON(sizeof(dst) != sizeof(src)); \ + memcpy((dst), (src), sizeof(dst)); \ + } while (0) + + MEMCPY_SAME_SIZE(td_params->mrconfigid, init_vm->mrconfigid); + MEMCPY_SAME_SIZE(td_params->mrowner, init_vm->mrowner); + MEMCPY_SAME_SIZE(td_params->mrownerconfig, init_vm->mrownerconfig); + + return 0; +} + +static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params, + u64 *seamcall_err) { struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); cpumask_var_t packages; struct page **tdcs_pages = NULL; struct page *tdr_page; int ret, i; - u64 err; + u64 err, rcx; + *seamcall_err = 0; ret = tdx_guest_keyid_alloc(); if (ret < 0) return ret; @@ -575,10 +721,23 @@ static int __tdx_td_init(struct kvm *kvm) } } - /* - * Note, TDH_MNG_INIT cannot be invoked here. TDH_MNG_INIT requires a dedicated - * ioctl() to define the configure CPUID values for the TD. - */ + err = tdh_mng_init(&kvm_tdx->td, __pa(td_params), &rcx); + if ((err & TDX_SEAMCALL_STATUS_MASK) == TDX_OPERAND_INVALID) { + /* + * Because a user gives operands, don't warn. + * Return a hint to the user because it's sometimes hard for the + * user to figure out which operand is invalid. SEAMCALL status + * code includes which operand caused invalid operand error. + */ + *seamcall_err = err; + ret = -EINVAL; + goto teardown; + } else if (WARN_ON_ONCE(err)) { + pr_tdx_error_1(TDH_MNG_INIT, err, rcx); + ret = -EIO; + goto teardown; + } + return 0; /* @@ -626,6 +785,82 @@ static int __tdx_td_init(struct kvm *kvm) return ret; } +static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + struct kvm_tdx_init_vm *init_vm; + struct td_params *td_params = NULL; + int ret; + + BUILD_BUG_ON(sizeof(*init_vm) != 256 + sizeof_field(struct kvm_tdx_init_vm, cpuid)); + BUILD_BUG_ON(sizeof(struct td_params) != 1024); + + if (kvm_tdx->state != TD_STATE_UNINITIALIZED) + return -EINVAL; + + if (cmd->flags) + return -EINVAL; + + init_vm = kmalloc(sizeof(*init_vm) + + sizeof(init_vm->cpuid.entries[0]) * KVM_MAX_CPUID_ENTRIES, + GFP_KERNEL); + if (!init_vm) + return -ENOMEM; + + if (copy_from_user(init_vm, u64_to_user_ptr(cmd->data), sizeof(*init_vm))) { + ret = -EFAULT; + goto out; + } + + if (init_vm->cpuid.nent > KVM_MAX_CPUID_ENTRIES) { + ret = -E2BIG; + goto out; + } + + if (copy_from_user(init_vm->cpuid.entries, + u64_to_user_ptr(cmd->data) + sizeof(*init_vm), + flex_array_size(init_vm, cpuid.entries, init_vm->cpuid.nent))) { + ret = -EFAULT; + goto out; + } + + if (memchr_inv(init_vm->reserved, 0, sizeof(init_vm->reserved))) { + ret = -EINVAL; + goto out; + } + + if (init_vm->cpuid.padding) { + ret = -EINVAL; + goto out; + } + + td_params = kzalloc(sizeof(struct td_params), GFP_KERNEL); + if (!td_params) { + ret = -ENOMEM; + goto out; + } + + ret = setup_tdparams(kvm, td_params, init_vm); + if (ret) + goto out; + + ret = __tdx_td_init(kvm, td_params, &cmd->hw_error); + if (ret) + goto out; + + kvm_tdx->tsc_offset = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_OFFSET); + kvm_tdx->attributes = td_params->attributes; + kvm_tdx->xfam = td_params->xfam; + + kvm_tdx->state = TD_STATE_INITIALIZED; +out: + /* kfree() accepts NULL. */ + kfree(init_vm); + kfree(td_params); + + return ret; +} + int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { struct kvm_tdx_cmd tdx_cmd; @@ -647,6 +882,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) case KVM_TDX_CAPABILITIES: r = tdx_get_capabilities(&tdx_cmd); break; + case KVM_TDX_INIT_VM: + r = tdx_td_init(kvm, &tdx_cmd); + break; default: r = -EINVAL; goto out; diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index 8102461f775d..f12854c8ff07 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -11,9 +11,23 @@ void tdx_cleanup(void); extern bool enable_tdx; +/* TDX module hardware states. These follow the TDX module OP_STATEs. */ +enum kvm_tdx_state { + TD_STATE_UNINITIALIZED = 0, + TD_STATE_INITIALIZED, + TD_STATE_RUNNABLE, +}; + struct kvm_tdx { struct kvm kvm; + int hkid; + enum kvm_tdx_state state; + + u64 attributes; + u64 xfam; + + u64 tsc_offset; struct tdx_td td; }; @@ -33,6 +47,17 @@ static inline bool is_td_vcpu(struct kvm_vcpu *vcpu) return is_td(vcpu->kvm); } +static __always_inline u64 td_tdcs_exec_read64(struct kvm_tdx *kvm_tdx, u32 field) +{ + u64 err, data; + + err = tdh_mng_rd(&kvm_tdx->td, TDCS_EXEC(field), &data); + if (unlikely(err)) { + pr_err("TDH_MNG_RD[EXEC.0x%x] failed: 0x%llx\n", field, err); + return 0; + } + return data; +} #else static inline int tdx_bringup(void) { return 0; } static inline void tdx_cleanup(void) {} From patchwork Thu Feb 20 17:05:58 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984321 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C37A9222582 for ; Thu, 20 Feb 2025 17:06:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071213; cv=none; b=A9bB9CylgL/sn0uyFhHxLYSG8h449bz3RBViBQBpNhGVhNVvKcpod3ZhNL16J4LKE8ZCxWx8nvpoTbWF15JNyat4UdXXmRWdgPdfIJc2Crxsc1hggXOOv6RadROP16jedh6YRMCp62w7WEY9Oasn3Knwh538cjpERfnRl3h8VSs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071213; c=relaxed/simple; bh=7qRFF3KjBkZ/U8ObQQMR4Tcp+Qr8tUOdKL/8XILRiqM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=lJeCI9xpye+zDq38Hz1JFsQ/2U2onfu2vCSCyKEPaBvGBLPjaXx9whZxJ2XGN5pvP/YTIm7JlJgMehgsHpuPRzdoNCJ/jvR6N7dVKcAUIYmtZSLvKTlU6ql/H4KilFU392oNQIs1fgDVDl24pA4/a5heh3XoCcf/bzecDDZTOJE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=hVmxm/c9; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="hVmxm/c9" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071209; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dd5Ky/N9ingEVckn7njy5YI5GcNFlAhwwbSFuVZ3CoM=; b=hVmxm/c918sHVCaBr7YXK4jEV98ByH7uF/t/78QbZ633nbecel/eMeLHvOjc/VYdx8SPYj NVsrRTxRGyq5Zf/FM3e/hyOpGipkNhUz8Ioms0+BrEurfN9q/76AX2ABu9zxT74lpiOf38 ekLWEDNRklntU9GDCHDQsDg2bo7XvDo= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-441-qTpmE3YdMMq2fRtEjwIhUA-1; Thu, 20 Feb 2025 12:06:43 -0500 X-MC-Unique: qTpmE3YdMMq2fRtEjwIhUA-1 X-Mimecast-MFC-AGG-ID: qTpmE3YdMMq2fRtEjwIhUA_1740071202 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4D10319039C6; Thu, 20 Feb 2025 17:06:42 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 344FF19412A6; Thu, 20 Feb 2025 17:06:41 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Tony Lindgren Subject: [PATCH 24/30] KVM: TDX: Make pmu_intel.c ignore guest TD case Date: Thu, 20 Feb 2025 12:05:58 -0500 Message-ID: <20250220170604.2279312-25-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Isaku Yamahata TDX KVM doesn't support PMU yet, it's future work of TDX KVM support as another patch series. For now, handle TDX by updating vcpu_to_lbr_desc() and vcpu_to_lbr_records() to return NULL. Signed-off-by: Isaku Yamahata Co-developed-by: Tony Lindgren Signed-off-by: Tony Lindgren Signed-off-by: Rick Edgecombe --- - Add pragma poison for to_vmx() (Paolo) Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/pmu_intel.c | 52 +++++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/pmu_intel.h | 28 +++++++++++++++++++ arch/x86/kvm/vmx/vmx.h | 34 +---------------------- 3 files changed, 80 insertions(+), 34 deletions(-) create mode 100644 arch/x86/kvm/vmx/pmu_intel.h diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index 77012b2eca0e..8a94b52c5731 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -19,6 +19,7 @@ #include "lapic.h" #include "nested.h" #include "pmu.h" +#include "tdx.h" /* * Perf's "BASE" is wildly misleading, architectural PMUs use bits 31:16 of ECX @@ -34,6 +35,24 @@ #define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0) +static struct lbr_desc *vcpu_to_lbr_desc(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) + return NULL; + + return &to_vmx(vcpu)->lbr_desc; +} + +static struct x86_pmu_lbr *vcpu_to_lbr_records(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) + return NULL; + + return &to_vmx(vcpu)->lbr_desc.records; +} + +#pragma GCC poison to_vmx + static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data) { struct kvm_pmc *pmc; @@ -129,6 +148,22 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr) return get_gp_pmc(pmu, msr, MSR_IA32_PMC0); } +static bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) + return false; + + return cpuid_model_is_consistent(vcpu); +} + +bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) + return false; + + return !!vcpu_to_lbr_records(vcpu)->nr; +} + static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index) { struct x86_pmu_lbr *records = vcpu_to_lbr_records(vcpu); @@ -194,6 +229,9 @@ static inline void intel_pmu_release_guest_lbr_event(struct kvm_vcpu *vcpu) { struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu); + if (!lbr_desc) + return; + if (lbr_desc->event) { perf_event_release_kernel(lbr_desc->event); lbr_desc->event = NULL; @@ -235,6 +273,9 @@ int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu) PERF_SAMPLE_BRANCH_USER, }; + if (WARN_ON_ONCE(!lbr_desc)) + return 0; + if (unlikely(lbr_desc->event)) { __set_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use); return 0; @@ -466,6 +507,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu) u64 perf_capabilities; u64 counter_rsvd; + if (!lbr_desc) + return; + memset(&lbr_desc->records, 0, sizeof(lbr_desc->records)); /* @@ -542,7 +586,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu) INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters); perf_capabilities = vcpu_get_perf_capabilities(vcpu); - if (cpuid_model_is_consistent(vcpu) && + if (intel_pmu_lbr_is_compatible(vcpu) && (perf_capabilities & PMU_CAP_LBR_FMT)) memcpy(&lbr_desc->records, &vmx_lbr_caps, sizeof(vmx_lbr_caps)); else @@ -570,6 +614,9 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu) struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu); + if (!lbr_desc) + return; + for (i = 0; i < KVM_MAX_NR_INTEL_GP_COUNTERS; i++) { pmu->gp_counters[i].type = KVM_PMC_GP; pmu->gp_counters[i].vcpu = vcpu; @@ -677,6 +724,9 @@ void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu) struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu); + if (WARN_ON_ONCE(!lbr_desc)) + return; + if (!lbr_desc->event) { vmx_disable_lbr_msrs_passthrough(vcpu); if (vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR) diff --git a/arch/x86/kvm/vmx/pmu_intel.h b/arch/x86/kvm/vmx/pmu_intel.h new file mode 100644 index 000000000000..5620d0882cdc --- /dev/null +++ b/arch/x86/kvm/vmx/pmu_intel.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __KVM_X86_VMX_PMU_INTEL_H +#define __KVM_X86_VMX_PMU_INTEL_H + +#include + +bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu); +int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu); + +struct lbr_desc { + /* Basic info about guest LBR records. */ + struct x86_pmu_lbr records; + + /* + * Emulate LBR feature via passthrough LBR registers when the + * per-vcpu guest LBR event is scheduled on the current pcpu. + * + * The records may be inaccurate if the host reclaims the LBR. + */ + struct perf_event *event; + + /* True if LBRs are marked as not intercepted in the MSR bitmap */ + bool msr_passthrough; +}; + +extern struct x86_pmu_lbr vmx_lbr_caps; + +#endif /* __KVM_X86_VMX_PMU_INTEL_H */ diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 299fa1edf534..a58b940f0634 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -11,6 +11,7 @@ #include "capabilities.h" #include "../kvm_cache_regs.h" +#include "pmu_intel.h" #include "vmcs.h" #include "vmx_ops.h" #include "../cpuid.h" @@ -90,24 +91,6 @@ union vmx_exit_reason { u32 full; }; -struct lbr_desc { - /* Basic info about guest LBR records. */ - struct x86_pmu_lbr records; - - /* - * Emulate LBR feature via passthrough LBR registers when the - * per-vcpu guest LBR event is scheduled on the current pcpu. - * - * The records may be inaccurate if the host reclaims the LBR. - */ - struct perf_event *event; - - /* True if LBRs are marked as not intercepted in the MSR bitmap */ - bool msr_passthrough; -}; - -extern struct x86_pmu_lbr vmx_lbr_caps; - /* * The nested_vmx structure is part of vcpu_vmx, and holds information we need * for correct emulation of VMX (i.e., nested VMX) on this vcpu. @@ -664,21 +647,6 @@ static __always_inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu) return container_of(vcpu, struct vcpu_vmx, vcpu); } -static inline struct lbr_desc *vcpu_to_lbr_desc(struct kvm_vcpu *vcpu) -{ - return &to_vmx(vcpu)->lbr_desc; -} - -static inline struct x86_pmu_lbr *vcpu_to_lbr_records(struct kvm_vcpu *vcpu) -{ - return &vcpu_to_lbr_desc(vcpu)->records; -} - -static inline bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu) -{ - return !!vcpu_to_lbr_records(vcpu)->nr; -} - void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu); int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu); void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu); From patchwork Thu Feb 20 17:05:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984323 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D8647223708 for ; Thu, 20 Feb 2025 17:06:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071215; cv=none; b=TdEG2UL4dKBvQLsMA/U2Jbl9OJOGaHktoZSXGYty8sh0QL3nrealPmZ0wDIhDlq7oKD9ySLHARggvIW/C/mWW8cf7fUMHgQAsVhUe1t5hss2TVezWH5U7A1sgRg4YOdDj4Iy8E3D67IMuip5tzUsVAd4crmeK5ctOMQTX6bPbdA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071215; c=relaxed/simple; bh=aOe0kBWT/SMLYV/wZaMc0z+GF6EZcZUqsRf5Y3fJ0NM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=X2cHcJvgxWaI110RaNyuvZPuyOQJVk5SYz9bZNIpN8Ap5+yQk6MZPhcMBJIf7Seb+7q4jrICfDb8QRPdt0uErW+uFnR0kgLWlzG9uF/18lCEx0iL0KVa9FxNiBLlrC50gI91Lurd1yLKGmkR9VZFnZb25B697PLCtxML2Nwl49w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Otg+QY53; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Otg+QY53" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071211; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OK4+tjkqFGEwtmhx8CZ6Jsa93KLzas0PPqSsxfoBC1I=; b=Otg+QY53Hbfr9d8+EopC9yohtU+MQwaKX06khm4loDDayvpv1uR3fuf3APqbWgfBHNzGBL IPUxNB0T887EUzqn52YU11KIB0ckzFmpKFAHyQxqicUbo9DWSspapW2LqTjdeHplGDvAzJ 2lu5ppFP7XpxiXSm0/eiUYCmnftPxqA= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-453-H9-SsvULO2WCcGZVV25XkQ-1; Thu, 20 Feb 2025 12:06:45 -0500 X-MC-Unique: H9-SsvULO2WCcGZVV25XkQ-1 X-Mimecast-MFC-AGG-ID: H9-SsvULO2WCcGZVV25XkQ_1740071203 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B79591800570; Thu, 20 Feb 2025 17:06:43 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 8263C19412A3; Thu, 20 Feb 2025 17:06:42 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Kai Huang , Binbin Wu Subject: [PATCH 25/30] KVM: TDX: Don't offline the last cpu of one package when there's TDX guest Date: Thu, 20 Feb 2025 12:05:59 -0500 Message-ID: <20250220170604.2279312-26-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Isaku Yamahata Destroying TDX guest requires there's at least one cpu online for each package, because reclaiming the TDX KeyID of the guest (as part of the teardown process) requires to call some SEAMCALL (on any cpu) on all packages. Do not offline the last cpu of one package when there's any TDX guest running, otherwise KVM may not be able to teardown TDX guest resulting in leaking of TDX KeyID and other resources like TDX guest control structure pages. Implement the TDX version 'offline_cpu()' to prevent the cpu from going offline if it is the last cpu on the package. Co-developed-by: Kai Huang Signed-off-by: Kai Huang Suggested-by: Sean Christopherson Signed-off-by: Isaku Yamahata Signed-off-by: Rick Edgecombe Reviewed-by: Binbin Wu Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/tdx.c | 43 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index ec8864453787..f794cd914050 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -131,6 +131,8 @@ static int init_kvm_tdx_caps(const struct tdx_sys_info_td_conf *td_conf, */ static DEFINE_MUTEX(tdx_lock); +static atomic_t nr_configured_hkid; + /* Maximum number of retries to attempt for SEAMCALLs. */ #define TDX_SEAMCALL_RETRIES 10000 @@ -138,6 +140,7 @@ static inline void tdx_hkid_free(struct kvm_tdx *kvm_tdx) { tdx_guest_keyid_free(kvm_tdx->hkid); kvm_tdx->hkid = -1; + atomic_dec(&nr_configured_hkid); } static inline bool is_hkid_assigned(struct kvm_tdx *kvm_tdx) @@ -617,6 +620,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params, ret = -ENOMEM; + atomic_inc(&nr_configured_hkid); + tdr_page = alloc_page(GFP_KERNEL); if (!tdr_page) goto free_hkid; @@ -913,6 +918,42 @@ static int tdx_online_cpu(unsigned int cpu) return r; } +static int tdx_offline_cpu(unsigned int cpu) +{ + int i; + + /* No TD is running. Allow any cpu to be offline. */ + if (!atomic_read(&nr_configured_hkid)) + return 0; + + /* + * In order to reclaim TDX HKID, (i.e. when deleting guest TD), need to + * call TDH.PHYMEM.PAGE.WBINVD on all packages to program all memory + * controller with pconfig. If we have active TDX HKID, refuse to + * offline the last online cpu. + */ + for_each_online_cpu(i) { + /* + * Found another online cpu on the same package. + * Allow to offline. + */ + if (i != cpu && topology_physical_package_id(i) == + topology_physical_package_id(cpu)) + return 0; + } + + /* + * This is the last cpu of this package. Don't offline it. + * + * Because it's hard for human operator to understand the + * reason, warn it. + */ +#define MSG_ALLPKG_ONLINE \ + "TDX requires all packages to have an online CPU. Delete all TDs in order to offline all CPUs of a package.\n" + pr_warn_ratelimited(MSG_ALLPKG_ONLINE); + return -EBUSY; +} + static void __do_tdx_cleanup(void) { /* @@ -938,7 +979,7 @@ static int __init __do_tdx_bringup(void) */ r = cpuhp_setup_state_cpuslocked(CPUHP_AP_ONLINE_DYN, "kvm/cpu/tdx:online", - tdx_online_cpu, NULL); + tdx_online_cpu, tdx_offline_cpu); if (r < 0) return r; From patchwork Thu Feb 20 17:06:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984322 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1F5F7222589 for ; Thu, 20 Feb 2025 17:06:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071213; cv=none; b=pL9191UKOHWvxw+YbHJZUcpfoNTZQiRWwRtzg30icCSubA7EyRhAjSjP9vHz5upnhxz357ryv8gUDvEgRpFXBdr6DKnDkD1UiJMm5ZITMJzdlDTBEqCrhPbrQxR6fjdOv3upgI3DE2X0edElnpFmrrkxt0Zk9Fb6koXf8ciNcyY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071213; c=relaxed/simple; bh=RrNygcpIAuAzeObQdp+KD0lcfXt/GLIRnw0/L7SoIOY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=duXMPqhuWMQsWiTpJ4nldNjNAVrJyWQcImgBudrUnkVfQEeKSU8JRa+AQiEYQZSWSdPeIgqd7R4WXfvx+5PLp9jPn8pH7J5qaGy2cXUOAHR8Ib6OqxtV2rc3Egt31qoozEocZoguL3bB66d5pljQV/wBtn87TC3V1pJSEecQqTg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=EeWQxbwd; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="EeWQxbwd" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071210; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=L1fUPJ+H1OTlS8BptC5w8m7YAt+/dSD1KgBgoTJ7MWc=; b=EeWQxbwdAm0xuvMPHa1ccgUW/C7QGFpRdLttz1XARu/NNdeie4kYIh70C6gQjmxGp+bmUq zimVzGno2xW3zIy3ZVWdbVfrPbQG/P0f9xV4qKMFdRq+RtcH9vaBZZ88MkLHam4aJ0T0Ur 7qyN6S3e9IFhJTn85QUourES4Un5s4k= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-92-0tjMAvXYN-i0tlfN556epQ-1; Thu, 20 Feb 2025 12:06:46 -0500 X-MC-Unique: 0tjMAvXYN-i0tlfN556epQ-1 X-Mimecast-MFC-AGG-ID: 0tjMAvXYN-i0tlfN556epQ_1740071205 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id E7C0D1800264; Thu, 20 Feb 2025 17:06:44 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id ECD6119412A3; Thu, 20 Feb 2025 17:06:43 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata Subject: [PATCH 26/30] KVM: TDX: create/free TDX vcpu structure Date: Thu, 20 Feb 2025 12:06:00 -0500 Message-ID: <20250220170604.2279312-27-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Isaku Yamahata Implement vcpu related stubs for TDX for create, reset and free. For now, create only the features that do not require the TDX SEAMCALL. The TDX specific vcpu initialization will be handled by KVM_TDX_INIT_VCPU. Signed-off-by: Isaku Yamahata Signed-off-by: Rick Edgecombe --- - Use lapic_in_kernel() (Nikolay Borisov) Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/main.c | 42 ++++++++++++++++++++++++++++++++++---- arch/x86/kvm/vmx/tdx.c | 35 +++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/x86_ops.h | 8 ++++++++ 3 files changed, 81 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 83f0d4d34a4a..d06f9f0519bd 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -64,6 +64,40 @@ static void vt_vm_free(struct kvm *kvm) tdx_vm_free(kvm); } +static int vt_vcpu_precreate(struct kvm *kvm) +{ + if (is_td(kvm)) + return 0; + + return vmx_vcpu_precreate(kvm); +} + +static int vt_vcpu_create(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) + return tdx_vcpu_create(vcpu); + + return vmx_vcpu_create(vcpu); +} + +static void vt_vcpu_free(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) { + tdx_vcpu_free(vcpu); + return; + } + + vmx_vcpu_free(vcpu); +} + +static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) +{ + if (is_td_vcpu(vcpu)) + return; + + vmx_vcpu_reset(vcpu, init_event); +} + static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp) { if (!is_td(kvm)) @@ -100,10 +134,10 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .vm_destroy = vt_vm_destroy, .vm_free = vt_vm_free, - .vcpu_precreate = vmx_vcpu_precreate, - .vcpu_create = vmx_vcpu_create, - .vcpu_free = vmx_vcpu_free, - .vcpu_reset = vmx_vcpu_reset, + .vcpu_precreate = vt_vcpu_precreate, + .vcpu_create = vt_vcpu_create, + .vcpu_free = vt_vcpu_free, + .vcpu_reset = vt_vcpu_reset, .prepare_switch_to_guest = vmx_prepare_switch_to_guest, .vcpu_load = vmx_vcpu_load, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index f794cd914050..4ab5c994d877 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -4,6 +4,7 @@ #include #include "capabilities.h" #include "x86_ops.h" +#include "lapic.h" #include "tdx.h" #pragma GCC poison to_vmx @@ -420,6 +421,40 @@ int tdx_vm_init(struct kvm *kvm) return 0; } +int tdx_vcpu_create(struct kvm_vcpu *vcpu) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm); + + if (kvm_tdx->state != TD_STATE_INITIALIZED) + return -EIO; + + /* TDX module mandates APICv, which requires an in-kernel local APIC. */ + if (!lapic_in_kernel(vcpu)) + return -EINVAL; + + fpstate_set_confidential(&vcpu->arch.guest_fpu); + + vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX; + + vcpu->arch.cr0_guest_owned_bits = -1ul; + vcpu->arch.cr4_guest_owned_bits = -1ul; + + vcpu->arch.tsc_offset = kvm_tdx->tsc_offset; + vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset; + vcpu->arch.guest_state_protected = + !(to_kvm_tdx(vcpu->kvm)->attributes & TDX_TD_ATTR_DEBUG); + + if ((kvm_tdx->xfam & XFEATURE_MASK_XTILE) == XFEATURE_MASK_XTILE) + vcpu->arch.xfd_no_write_intercept = true; + + return 0; +} + +void tdx_vcpu_free(struct kvm_vcpu *vcpu) +{ + /* This is stub for now. More logic will come. */ +} + static int tdx_get_capabilities(struct kvm_tdx_cmd *cmd) { const struct tdx_sys_info_td_conf *td_conf = &tdx_sysinfo->td_conf; diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 917473c6bf3c..6e1ff7d88b61 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -125,12 +125,20 @@ void vmx_setup_mce(struct kvm_vcpu *vcpu); int tdx_vm_init(struct kvm *kvm); void tdx_mmu_release_hkid(struct kvm *kvm); void tdx_vm_free(struct kvm *kvm); + int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); + +int tdx_vcpu_create(struct kvm_vcpu *vcpu); +void tdx_vcpu_free(struct kvm_vcpu *vcpu); #else static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; } static inline void tdx_mmu_release_hkid(struct kvm *kvm) {} static inline void tdx_vm_free(struct kvm *kvm) {} + static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } + +static inline int tdx_vcpu_create(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; } +static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {} #endif #endif /* __KVM_X86_VMX_X86_OPS_H */ From patchwork Thu Feb 20 17:06:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984324 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6DBCC2139D1 for ; Thu, 20 Feb 2025 17:06:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071216; cv=none; b=Dbv1GTf7iAfL5dBWweLHU9jaoMzMh6D9/YhjlEos4WIzhOM8//ZHmENVc3lG+g6XC7PueEqzvWgnNqR8sWqt+VginwktkkFL1tx5tC5Di7ru9XEkkMo67luQqL3kxuCrTw6aWB+SexNntZXWy5vOAmuWkSb+mJxdTN/xc0CpZNk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071216; c=relaxed/simple; bh=SRUHghBt0EjAXA2PPZey7S9tMi48R+sW2GBl1FWSsZ4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=eAUWLf1P9rSaAFgu03I/qKbcBa3AJcI2KtJJUIbKoozlq7F/E/aJ2CzWQ+4Bg18/5z/+eS/jxTOjulmMAgR0tQ485AYKGbarc9zxK1woEEMQKDenwTCWl2ym22uOc/LUsSv3iv5kH34wK/bBgfdQKpYruOUZysY8iD40oSnd42k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=YG4g6W6J; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="YG4g6W6J" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071212; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FKThzOWBhSgB3n7KXEQU7sCM801172+qrbI8O9kVTnc=; b=YG4g6W6J8pAwebzW77kc938pdvD3iGrp2ciWCrFNJxhQtMeQ2c04alynY6URiY4PAEWPet jXOxgQr3keWOAf3vjjfQHw+YH1HqD+02WUVyv1Z4AA5fiQOLseZGHLu58kyjY4VJfwc1S3 Vj4bBVPHaDEd84vpUI91WgvPy9N/iyc= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-502-gasat3lHOJqHNxNPIJRoyA-1; Thu, 20 Feb 2025 12:06:48 -0500 X-MC-Unique: gasat3lHOJqHNxNPIJRoyA-1 X-Mimecast-MFC-AGG-ID: gasat3lHOJqHNxNPIJRoyA_1740071206 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7D74719030A0; Thu, 20 Feb 2025 17:06:46 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 2929819412A3; Thu, 20 Feb 2025 17:06:45 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Isaku Yamahata , Sean Christopherson , Tony Lindgren , Adrian Hunter Subject: [PATCH 27/30] KVM: TDX: Do TDX specific vcpu initialization Date: Thu, 20 Feb 2025 12:06:01 -0500 Message-ID: <20250220170604.2279312-28-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Isaku Yamahata TD guest vcpu needs TDX specific initialization before running. Repurpose KVM_MEMORY_ENCRYPT_OP to vcpu-scope, add a new sub-command KVM_TDX_INIT_VCPU, and implement the callback for it. Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Co-developed-by: Tony Lindgren Signed-off-by: Tony Lindgren Co-developed-by: Adrian Hunter Signed-off-by: Adrian Hunter Signed-off-by: Rick Edgecombe --- - Fix comment: https://lore.kernel.org/kvm/Z36OYfRW9oPjW8be@google.com/ (Sean) Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/lapic.c | 1 + arch/x86/kvm/vmx/main.c | 9 ++ arch/x86/kvm/vmx/tdx.c | 172 ++++++++++++++++++++++++++++- arch/x86/kvm/vmx/tdx.h | 11 +- arch/x86/kvm/vmx/x86_ops.h | 4 + arch/x86/kvm/x86.c | 7 ++ 9 files changed, 205 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index bb909e6125b4..e9e4cecd6897 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -127,6 +127,7 @@ KVM_X86_OP(enable_smi_window) #endif KVM_X86_OP_OPTIONAL(dev_get_attr) KVM_X86_OP(mem_enc_ioctl) +KVM_X86_OP_OPTIONAL(vcpu_mem_enc_ioctl) KVM_X86_OP_OPTIONAL(mem_enc_register_region) KVM_X86_OP_OPTIONAL(mem_enc_unregister_region) KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 831b20830a09..1d3d0dfcc572 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1847,6 +1847,7 @@ struct kvm_x86_ops { int (*dev_get_attr)(u32 group, u64 attr, u64 *val); int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp); + int (*vcpu_mem_enc_ioctl)(struct kvm_vcpu *vcpu, void __user *argp); int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp); int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp); int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd); diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index b64351076f2a..9316afbd4a88 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -931,6 +931,7 @@ struct kvm_hyperv_eventfd { enum kvm_tdx_cmd_id { KVM_TDX_CAPABILITIES = 0, KVM_TDX_INIT_VM, + KVM_TDX_INIT_VCPU, KVM_TDX_CMD_NR_MAX, }; diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index a009c94c26c2..a1cbca31ec30 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2657,6 +2657,7 @@ int kvm_apic_set_base(struct kvm_vcpu *vcpu, u64 value, bool host_initiated) kvm_recalculate_apic_map(vcpu->kvm); return 0; } +EXPORT_SYMBOL_GPL(kvm_apic_set_base); void kvm_apic_update_apicv(struct kvm_vcpu *vcpu) { diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index d06f9f0519bd..d78f0cf3dd9c 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -106,6 +106,14 @@ static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp) return tdx_vm_ioctl(kvm, argp); } +static int vt_vcpu_mem_enc_ioctl(struct kvm_vcpu *vcpu, void __user *argp) +{ + if (!is_td_vcpu(vcpu)) + return -EINVAL; + + return tdx_vcpu_ioctl(vcpu, argp); +} + #define VMX_REQUIRED_APICV_INHIBITS \ (BIT(APICV_INHIBIT_REASON_DISABLED) | \ BIT(APICV_INHIBIT_REASON_ABSENT) | \ @@ -262,6 +270,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .get_untagged_addr = vmx_get_untagged_addr, .mem_enc_ioctl = vt_mem_enc_ioctl, + .vcpu_mem_enc_ioctl = vt_vcpu_mem_enc_ioctl, }; struct kvm_x86_init_ops vt_init_ops __initdata = { diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 4ab5c994d877..7dfb0b486dd9 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -424,6 +424,7 @@ int tdx_vm_init(struct kvm *kvm) int tdx_vcpu_create(struct kvm_vcpu *vcpu) { struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm); + struct vcpu_tdx *tdx = to_tdx(vcpu); if (kvm_tdx->state != TD_STATE_INITIALIZED) return -EIO; @@ -447,12 +448,42 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu) if ((kvm_tdx->xfam & XFEATURE_MASK_XTILE) == XFEATURE_MASK_XTILE) vcpu->arch.xfd_no_write_intercept = true; + tdx->state = VCPU_TD_STATE_UNINITIALIZED; + return 0; } void tdx_vcpu_free(struct kvm_vcpu *vcpu) { - /* This is stub for now. More logic will come. */ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm); + struct vcpu_tdx *tdx = to_tdx(vcpu); + int i; + + /* + * It is not possible to reclaim pages while hkid is assigned. It might + * be assigned if: + * 1. the TD VM is being destroyed but freeing hkid failed, in which + * case the pages are leaked + * 2. TD VCPU creation failed and this on the error path, in which case + * there is nothing to do anyway + */ + if (is_hkid_assigned(kvm_tdx)) + return; + + if (tdx->vp.tdcx_pages) { + for (i = 0; i < kvm_tdx->td.tdcx_nr_pages; i++) { + if (tdx->vp.tdcx_pages[i]) + tdx_reclaim_control_page(tdx->vp.tdcx_pages[i]); + } + kfree(tdx->vp.tdcx_pages); + tdx->vp.tdcx_pages = NULL; + } + if (tdx->vp.tdvpr_page) { + tdx_reclaim_control_page(tdx->vp.tdvpr_page); + tdx->vp.tdvpr_page = 0; + } + + tdx->state = VCPU_TD_STATE_UNINITIALIZED; } static int tdx_get_capabilities(struct kvm_tdx_cmd *cmd) @@ -662,6 +693,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params, goto free_hkid; kvm_tdx->td.tdcs_nr_pages = tdx_sysinfo->td_ctrl.tdcs_base_size / PAGE_SIZE; + /* TDVPS = TDVPR(4K page) + TDCX(multiple 4K pages), -1 for TDVPR. */ + kvm_tdx->td.tdcx_nr_pages = tdx_sysinfo->td_ctrl.tdvps_base_size / PAGE_SIZE - 1; tdcs_pages = kcalloc(kvm_tdx->td.tdcs_nr_pages, sizeof(*kvm_tdx->td.tdcs_pages), GFP_KERNEL | __GFP_ZERO); if (!tdcs_pages) @@ -938,6 +971,143 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) return r; } +/* VMM can pass one 64bit auxiliary data to vcpu via RCX for guest BIOS. */ +static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u64 vcpu_rcx) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm); + struct vcpu_tdx *tdx = to_tdx(vcpu); + struct page *page; + int ret, i; + u64 err; + + page = alloc_page(GFP_KERNEL); + if (!page) + return -ENOMEM; + tdx->vp.tdvpr_page = page; + + tdx->vp.tdcx_pages = kcalloc(kvm_tdx->td.tdcx_nr_pages, sizeof(*tdx->vp.tdcx_pages), + GFP_KERNEL); + if (!tdx->vp.tdcx_pages) { + ret = -ENOMEM; + goto free_tdvpr; + } + + for (i = 0; i < kvm_tdx->td.tdcx_nr_pages; i++) { + page = alloc_page(GFP_KERNEL); + if (!page) { + ret = -ENOMEM; + goto free_tdcx; + } + tdx->vp.tdcx_pages[i] = page; + } + + err = tdh_vp_create(&kvm_tdx->td, &tdx->vp); + if (KVM_BUG_ON(err, vcpu->kvm)) { + ret = -EIO; + pr_tdx_error(TDH_VP_CREATE, err); + goto free_tdcx; + } + + for (i = 0; i < kvm_tdx->td.tdcx_nr_pages; i++) { + err = tdh_vp_addcx(&tdx->vp, tdx->vp.tdcx_pages[i]); + if (KVM_BUG_ON(err, vcpu->kvm)) { + pr_tdx_error(TDH_VP_ADDCX, err); + /* + * Pages already added are reclaimed by the vcpu_free + * method, but the rest are freed here. + */ + for (; i < kvm_tdx->td.tdcx_nr_pages; i++) { + __free_page(tdx->vp.tdcx_pages[i]); + tdx->vp.tdcx_pages[i] = NULL; + } + return -EIO; + } + } + + err = tdh_vp_init(&tdx->vp, vcpu_rcx, vcpu->vcpu_id); + if (KVM_BUG_ON(err, vcpu->kvm)) { + pr_tdx_error(TDH_VP_INIT, err); + return -EIO; + } + + vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; + + return 0; + +free_tdcx: + for (i = 0; i < kvm_tdx->td.tdcx_nr_pages; i++) { + if (tdx->vp.tdcx_pages[i]) + __free_page(tdx->vp.tdcx_pages[i]); + tdx->vp.tdcx_pages[i] = NULL; + } + kfree(tdx->vp.tdcx_pages); + tdx->vp.tdcx_pages = NULL; + +free_tdvpr: + if (tdx->vp.tdvpr_page) + __free_page(tdx->vp.tdvpr_page); + tdx->vp.tdvpr_page = 0; + + return ret; +} + +static int tdx_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd) +{ + u64 apic_base; + struct vcpu_tdx *tdx = to_tdx(vcpu); + int ret; + + if (cmd->flags) + return -EINVAL; + + if (tdx->state != VCPU_TD_STATE_UNINITIALIZED) + return -EINVAL; + + /* + * TDX requires X2APIC, userspace is responsible for configuring guest + * CPUID accordingly. + */ + apic_base = APIC_DEFAULT_PHYS_BASE | LAPIC_MODE_X2APIC | + (kvm_vcpu_is_reset_bsp(vcpu) ? MSR_IA32_APICBASE_BSP : 0); + if (kvm_apic_set_base(vcpu, apic_base, true)) + return -EINVAL; + + ret = tdx_td_vcpu_init(vcpu, (u64)cmd->data); + if (ret) + return ret; + + tdx->state = VCPU_TD_STATE_INITIALIZED; + + return 0; +} + +int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm); + struct kvm_tdx_cmd cmd; + int ret; + + if (!is_hkid_assigned(kvm_tdx) || kvm_tdx->state == TD_STATE_RUNNABLE) + return -EINVAL; + + if (copy_from_user(&cmd, argp, sizeof(cmd))) + return -EFAULT; + + if (cmd.hw_error) + return -EINVAL; + + switch (cmd.id) { + case KVM_TDX_INIT_VCPU: + ret = tdx_vcpu_init(vcpu, &cmd); + break; + default: + ret = -EINVAL; + break; + } + + return ret; +} + static int tdx_online_cpu(unsigned int cpu) { unsigned long flags; diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index f12854c8ff07..c3bde94c19dc 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -32,9 +32,18 @@ struct kvm_tdx { struct tdx_td td; }; +/* TDX module vCPU states */ +enum vcpu_tdx_state { + VCPU_TD_STATE_UNINITIALIZED = 0, + VCPU_TD_STATE_INITIALIZED, +}; + struct vcpu_tdx { struct kvm_vcpu vcpu; - /* TDX specific members follow. */ + + struct tdx_vp vp; + + enum vcpu_tdx_state state; }; static inline bool is_td(struct kvm *kvm) diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 6e1ff7d88b61..499566441717 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -130,6 +130,8 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); int tdx_vcpu_create(struct kvm_vcpu *vcpu); void tdx_vcpu_free(struct kvm_vcpu *vcpu); + +int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp); #else static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; } static inline void tdx_mmu_release_hkid(struct kvm *kvm) {} @@ -139,6 +141,8 @@ static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOP static inline int tdx_vcpu_create(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; } static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {} + +static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; } #endif #endif /* __KVM_X86_VMX_X86_OPS_H */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 007195d9ab0b..70a33c0a6e6e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6275,6 +6275,12 @@ long kvm_arch_vcpu_ioctl(struct file *filp, case KVM_SET_DEVICE_ATTR: r = kvm_vcpu_ioctl_device_attr(vcpu, ioctl, argp); break; + case KVM_MEMORY_ENCRYPT_OP: + r = -ENOTTY; + if (!kvm_x86_ops.vcpu_mem_enc_ioctl) + goto out; + r = kvm_x86_ops.vcpu_mem_enc_ioctl(vcpu, argp); + break; default: r = -EINVAL; } @@ -12660,6 +12666,7 @@ bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu) { return vcpu->kvm->arch.bsp_vcpu_id == vcpu->vcpu_id; } +EXPORT_SYMBOL_GPL(kvm_vcpu_is_reset_bsp); bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu) { From patchwork Thu Feb 20 17:06:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984325 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B069922578E for ; Thu, 20 Feb 2025 17:06:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071216; cv=none; b=KySuOK6uSydpWCruMmS1yvg1x3WjCJ2/ILDMrQm+3wcUyLExHilGPkUI2N4c3nfUZBAs09FvFVJjAcETn43tVcqxj38YRNDtl09GLPIBCtQDG4Kp8r/+uV2XoZ7G5dBBZm3jvp8qXrK0Fog9wZ6djUnWg0DC/3wo/+HgUxvNki8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071216; c=relaxed/simple; bh=aMUphNUgwDKVfyLvoSO130Sb6SyuzmF+GoT1s4np2+8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=qT210YPNjdamipFd8N51LlA6dwPWntXl9Bs3thGWEpkb/AFyKlF3prSmcmqfMEXwZFuLOmN3ge4YuebeV+e+Jk1ezMPK24BEdiNtSsb5ECI39RLCDbqmfdcs7T+3Bq6vobD7uOX2Z7cnr87s7a1JFVImXL754uh0BEeJhCixYkU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=A6/WpAm6; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="A6/WpAm6" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071213; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=KjWv5fSjEg41LoJNB4shPeDKoZz0HqWV9KX3dmsD1Xg=; b=A6/WpAm6iaN9jIyqOyzsqcUbQC0lg6ncfAo3yXfZgesm0bM9rG++v4aToiCVzpOMDx7ium PgHwtyftoXHnXnKvEphpPrW6yWq9DIClRfCIEDKQD2VMtLSPrKuow15mfXzwXiHIzm6mqJ b8zmwitqzfQKW/N1xbFJySe3LDyfG/0= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-226-V_3Sx4BTPCKiY-Lx_FNbJw-1; Thu, 20 Feb 2025 12:06:49 -0500 X-MC-Unique: V_3Sx4BTPCKiY-Lx_FNbJw-1 X-Mimecast-MFC-AGG-ID: V_3Sx4BTPCKiY-Lx_FNbJw_1740071207 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id CB00F1800874; Thu, 20 Feb 2025 17:06:47 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id B302E19412A3; Thu, 20 Feb 2025 17:06:46 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Xiaoyao Li , Tony Lindgren Subject: [PATCH 28/30] KVM: x86: Introduce KVM_TDX_GET_CPUID Date: Thu, 20 Feb 2025 12:06:02 -0500 Message-ID: <20250220170604.2279312-29-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Xiaoyao Li Implement an IOCTL to allow userspace to read the CPUID bit values for a configured TD. The TDX module doesn't provide the ability to set all CPUID bits. Instead some are configured indirectly, or have fixed values. But it does allow for the final resulting CPUID bits to be read. This information will be useful for userspace to understand the configuration of the TD, and set KVM's copy via KVM_SET_CPUID2. Signed-off-by: Xiaoyao Li Co-developed-by: Tony Lindgren Signed-off-by: Tony Lindgren Signed-off-by: Rick Edgecombe --- - Fix subleaf mask check (Binbin) - Search all possible sub-leafs (Francesco Lavra) - Reduce off-by-one error sensitve code (Francesco, Xiaoyao) - Handle buffers too small from userspace (Xiaoyao) - Read max CPUID from TD instead of using fixed values (Xiaoyao) Signed-off-by: Paolo Bonzini --- arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/vmx/tdx.c | 191 ++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/tdx_arch.h | 5 + arch/x86/kvm/vmx/tdx_errno.h | 1 + 4 files changed, 198 insertions(+) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 9316afbd4a88..cd55484e3f0c 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -932,6 +932,7 @@ enum kvm_tdx_cmd_id { KVM_TDX_CAPABILITIES = 0, KVM_TDX_INIT_VM, KVM_TDX_INIT_VCPU, + KVM_TDX_GET_CPUID, KVM_TDX_CMD_NR_MAX, }; diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 7dfb0b486dd9..1c88f577ec2b 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -3,6 +3,7 @@ #include #include #include "capabilities.h" +#include "mmu.h" #include "x86_ops.h" #include "lapic.h" #include "tdx.h" @@ -858,6 +859,103 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params, return ret; } +static u64 tdx_td_metadata_field_read(struct kvm_tdx *tdx, u64 field_id, + u64 *data) +{ + u64 err; + + err = tdh_mng_rd(&tdx->td, field_id, data); + + return err; +} + +#define TDX_MD_UNREADABLE_LEAF_MASK GENMASK(30, 7) +#define TDX_MD_UNREADABLE_SUBLEAF_MASK GENMASK(31, 7) + +static int tdx_read_cpuid(struct kvm_vcpu *vcpu, u32 leaf, u32 sub_leaf, + bool sub_leaf_set, int *entry_index, + struct kvm_cpuid_entry2 *out) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm); + u64 field_id = TD_MD_FIELD_ID_CPUID_VALUES; + u64 ebx_eax, edx_ecx; + u64 err = 0; + + if (sub_leaf > 0b1111111) + return -EINVAL; + + if (*entry_index >= KVM_MAX_CPUID_ENTRIES) + return -EINVAL; + + if (leaf & TDX_MD_UNREADABLE_LEAF_MASK || + sub_leaf & TDX_MD_UNREADABLE_SUBLEAF_MASK) + return -EINVAL; + + /* + * bit 23:17, REVSERVED: reserved, must be 0; + * bit 16, LEAF_31: leaf number bit 31; + * bit 15:9, LEAF_6_0: leaf number bits 6:0, leaf bits 30:7 are + * implicitly 0; + * bit 8, SUBLEAF_NA: sub-leaf not applicable flag; + * bit 7:1, SUBLEAF_6_0: sub-leaf number bits 6:0. If SUBLEAF_NA is 1, + * the SUBLEAF_6_0 is all-1. + * sub-leaf bits 31:7 are implicitly 0; + * bit 0, ELEMENT_I: Element index within field; + */ + field_id |= ((leaf & 0x80000000) ? 1 : 0) << 16; + field_id |= (leaf & 0x7f) << 9; + if (sub_leaf_set) + field_id |= (sub_leaf & 0x7f) << 1; + else + field_id |= 0x1fe; + + err = tdx_td_metadata_field_read(kvm_tdx, field_id, &ebx_eax); + if (err) //TODO check for specific errors + goto err_out; + + out->eax = (u32) ebx_eax; + out->ebx = (u32) (ebx_eax >> 32); + + field_id++; + err = tdx_td_metadata_field_read(kvm_tdx, field_id, &edx_ecx); + /* + * It's weird that reading edx_ecx fails while reading ebx_eax + * succeeded. + */ + if (WARN_ON_ONCE(err)) + goto err_out; + + out->ecx = (u32) edx_ecx; + out->edx = (u32) (edx_ecx >> 32); + + out->function = leaf; + out->index = sub_leaf; + out->flags |= sub_leaf_set ? KVM_CPUID_FLAG_SIGNIFCANT_INDEX : 0; + + /* + * Work around missing support on old TDX modules, fetch + * guest maxpa from gfn_direct_bits. + */ + if (leaf == 0x80000008) { + gpa_t gpa_bits = gfn_to_gpa(kvm_gfn_direct_bits(vcpu->kvm)); + unsigned int g_maxpa = __ffs(gpa_bits) + 1; + + out->eax = tdx_set_guest_phys_addr_bits(out->eax, g_maxpa); + } + + (*entry_index)++; + + return 0; + +err_out: + out->eax = 0; + out->ebx = 0; + out->ecx = 0; + out->edx = 0; + + return -EIO; +} + static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd) { struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); @@ -1051,6 +1149,96 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u64 vcpu_rcx) return ret; } +/* Sometimes reads multipple subleafs. Return how many enties were written. */ +static int tdx_vcpu_get_cpuid_leaf(struct kvm_vcpu *vcpu, u32 leaf, int *entry_index, + struct kvm_cpuid_entry2 *output_e) +{ + int sub_leaf = 0; + int ret; + + /* First try without a subleaf */ + ret = tdx_read_cpuid(vcpu, leaf, 0, false, entry_index, output_e); + + /* If success, or invalid leaf, just give up */ + if (ret != -EIO) + return ret; + + /* + * If the try without a subleaf failed, try reading subleafs until + * failure. The TDX module only supports 6 bits of subleaf index. + */ + while (1) { + /* Keep reading subleafs until there is a failure. */ + if (tdx_read_cpuid(vcpu, leaf, sub_leaf, true, entry_index, output_e)) + return !sub_leaf; + + sub_leaf++; + output_e++; + } + + return 0; +} + +static int tdx_vcpu_get_cpuid(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd) +{ + struct kvm_cpuid2 __user *output, *td_cpuid; + int r = 0, i = 0, leaf; + u32 level; + + output = u64_to_user_ptr(cmd->data); + td_cpuid = kzalloc(sizeof(*td_cpuid) + + sizeof(output->entries[0]) * KVM_MAX_CPUID_ENTRIES, + GFP_KERNEL); + if (!td_cpuid) + return -ENOMEM; + + if (copy_from_user(td_cpuid, output, sizeof(*output))) { + r = -EFAULT; + goto out; + } + + /* Read max CPUID for normal range */ + if (tdx_vcpu_get_cpuid_leaf(vcpu, 0, &i, &td_cpuid->entries[i])) { + r = -EIO; + goto out; + } + level = td_cpuid->entries[0].eax; + + for (leaf = 1; leaf <= level; leaf++) + tdx_vcpu_get_cpuid_leaf(vcpu, leaf, &i, &td_cpuid->entries[i]); + + /* Read max CPUID for extended range */ + if (tdx_vcpu_get_cpuid_leaf(vcpu, 0x80000000, &i, &td_cpuid->entries[i])) { + r = -EIO; + goto out; + } + level = td_cpuid->entries[i - 1].eax; + + for (leaf = 0x80000001; leaf <= level; leaf++) + tdx_vcpu_get_cpuid_leaf(vcpu, leaf, &i, &td_cpuid->entries[i]); + + if (td_cpuid->nent < i) + r = -E2BIG; + td_cpuid->nent = i; + + if (copy_to_user(output, td_cpuid, sizeof(*output))) { + r = -EFAULT; + goto out; + } + + if (r == -E2BIG) + goto out; + + if (copy_to_user(output->entries, td_cpuid->entries, + td_cpuid->nent * sizeof(struct kvm_cpuid_entry2))) + r = -EFAULT; + +out: + kfree(td_cpuid); + + return r; +} + static int tdx_vcpu_init(struct kvm_vcpu *vcpu, struct kvm_tdx_cmd *cmd) { u64 apic_base; @@ -1100,6 +1288,9 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) case KVM_TDX_INIT_VCPU: ret = tdx_vcpu_init(vcpu, &cmd); break; + case KVM_TDX_GET_CPUID: + ret = tdx_vcpu_get_cpuid(vcpu, &cmd); + break; default: ret = -EINVAL; break; diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h index cb9a638fa398..b6fe458a4224 100644 --- a/arch/x86/kvm/vmx/tdx_arch.h +++ b/arch/x86/kvm/vmx/tdx_arch.h @@ -122,4 +122,9 @@ struct td_params { #define MD_FIELD_ID_FEATURES0_TOPOLOGY_ENUM BIT_ULL(20) +/* + * TD scope metadata field ID. + */ +#define TD_MD_FIELD_ID_CPUID_VALUES 0x9410000300000000ULL + #endif /* __KVM_X86_TDX_ARCH_H */ diff --git a/arch/x86/kvm/vmx/tdx_errno.h b/arch/x86/kvm/vmx/tdx_errno.h index dc3fa2a58c2c..f9dbb3a065cc 100644 --- a/arch/x86/kvm/vmx/tdx_errno.h +++ b/arch/x86/kvm/vmx/tdx_errno.h @@ -23,6 +23,7 @@ #define TDX_FLUSHVP_NOT_DONE 0x8000082400000000ULL #define TDX_EPT_WALK_FAILED 0xC0000B0000000000ULL #define TDX_EPT_ENTRY_STATE_INCORRECT 0xC0000B0D00000000ULL +#define TDX_METADATA_FIELD_NOT_READABLE 0xC0000C0200000000ULL /* * TDX module operand ID, appears in 31:0 part of error code as From patchwork Thu Feb 20 17:06:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984327 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 504EE212D69 for ; Thu, 20 Feb 2025 17:07:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071227; cv=none; b=qaWU6DdT8IDoF/8PR1IvSVH2K7FTA3aYKojU3za3OPLsVueFxixynqSzLsg9hz8ghszE7HI7k0KfJq07EoEQwRZd5zzjtdzagTo7Je/VD9pBSUmi4unnYjnkh7Eo6CJam4r/RxAFO7FZp8v+cGL4qB9cxEKqngMxxxq6eWCD9Vo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071227; c=relaxed/simple; bh=WjyPZsuYt/J1ZXDdahnzCeLRanspFNYQ0w02lSCEdig=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=dvcWlp0Mcf3Mww9h6cNfAxHj0iHshk1QtHebyLeLSCtJ1tbkqoZjTLyB8MoU0yq86oVEkwmP8jKvr2aj2PKyvtxR9F72Nq6ioJ29zyD+H2tI9EK8VxYgTJPrghUXl4qsF3HEr49FYDEG4B3JERDksK3gaCiz59zGGo7JmkO/QE4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=eTZp45pS; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="eTZp45pS" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071225; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Yw+GWEMGktTe2EG3idQ1LE3+Xb5+a50/AkTUJ8BT5dQ=; b=eTZp45pSGSWe2V4LESuOVt3ICtQs2zqcRdmXXP4N4lfU94v1D4AHIIABMyiB0CfZ7mpOBg Dxd/5W9kgL1ftnS5pus/yQtZft/F5QvXcamiWPV21tbOqI2GTNtFYf1HbtKpMmM2sQmjV3 QBOK/7/tpZnzh8+BbXDdhsBLNy3GYoI= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-271-2GvadszLNS-7OACkx05beA-1; Thu, 20 Feb 2025 12:06:50 -0500 X-MC-Unique: 2GvadszLNS-7OACkx05beA-1 X-Mimecast-MFC-AGG-ID: 2GvadszLNS-7OACkx05beA_1740071209 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 056B31800872; Thu, 20 Feb 2025 17:06:49 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 0AF8819412A4; Thu, 20 Feb 2025 17:06:47 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Xiaoyao Li Subject: [PATCH 29/30] KVM: x86/mmu: Taking guest pa into consideration when calculate tdp level Date: Thu, 20 Feb 2025 12:06:03 -0500 Message-ID: <20250220170604.2279312-30-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Xiaoyao Li For TDX, the maxpa (CPUID.0x80000008.EAX[7:0]) is fixed as native and the max_gpa (CPUID.0x80000008.EAX[23:16]) is configurable and used to configure the EPT level and GPAW. Use max_gpa to determine the TDP level. Signed-off-by: Xiaoyao Li Signed-off-by: Rick Edgecombe Signed-off-by: Paolo Bonzini --- arch/x86/kvm/cpuid.c | 14 ++++++++++++++ arch/x86/kvm/cpuid.h | 1 + arch/x86/kvm/mmu/mmu.c | 9 ++++++++- 3 files changed, 23 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 936c5dd13bd5..7ff84079c1f4 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -494,6 +494,20 @@ int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu) return 36; } +int cpuid_query_maxguestphyaddr(struct kvm_vcpu *vcpu) +{ + struct kvm_cpuid_entry2 *best; + + best = kvm_find_cpuid_entry(vcpu, 0x80000000); + if (!best || best->eax < 0x80000008) + goto not_found; + best = kvm_find_cpuid_entry(vcpu, 0x80000008); + if (best) + return (best->eax >> 16) & 0xff; +not_found: + return 0; +} + /* * This "raw" version returns the reserved GPA bits without any adjustments for * encryption technologies that usurp bits. The raw mask should be used if and diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index 91bde94519f9..46e059b47825 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -37,6 +37,7 @@ void __init kvm_init_xstate_sizes(void); u32 xstate_required_size(u64 xstate_bv, bool compacted); int cpuid_query_maxphyaddr(struct kvm_vcpu *vcpu); +int cpuid_query_maxguestphyaddr(struct kvm_vcpu *vcpu); u64 kvm_vcpu_reserved_gpa_bits_raw(struct kvm_vcpu *vcpu); static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index d4ac4a1f8b81..c1c54a7a7735 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5416,12 +5416,19 @@ void __kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu, static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu) { + int maxpa; + + if (vcpu->kvm->arch.vm_type == KVM_X86_TDX_VM) + maxpa = cpuid_query_maxguestphyaddr(vcpu); + else + maxpa = cpuid_maxphyaddr(vcpu); + /* tdp_root_level is architecture forced level, use it if nonzero */ if (tdp_root_level) return tdp_root_level; /* Use 5-level TDP if and only if it's useful/necessary. */ - if (max_tdp_level == 5 && cpuid_maxphyaddr(vcpu) <= 48) + if (max_tdp_level == 5 && maxpa <= 48) return 4; return max_tdp_level; From patchwork Thu Feb 20 17:06:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 13984326 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 33AA722A4F1 for ; Thu, 20 Feb 2025 17:06:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071219; cv=none; b=ZM5WSCLh2h9TG+ky8NI8KKHaQNTHH1iKVn8uQ/w/6sSZJhgZ1PgDTbM9tigtEebTm3j5pWJyTTP6/fzcU2f+pHKrKvpjm2RyForNyyQuxgFSVILQ8/SiHqAfTKHhwLi1wh+koRav+yHhJjhgKHf3Qkq0FGu1E6PQX4RTfzPUxWE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740071219; c=relaxed/simple; bh=zwW1fa4b2WOaXd0TSba/wk7mWtshg3WgjRIrZ7ypJmE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=AwSdnKDE7vjLFUvSMR3BmJGRYaGY3WEEyZRF1pb9Ri+iBAjnO2sNKGoAGfPiwc5PozFJv2lnCf9KInjfIPqO8PzFhvVidSqRImjUgipdLUkkRD2FFlstL/mhJFsjxxGnWDoQDGhmxYGlTXo2IexZxD7wv3UofFSuCSPKQ/RJTCA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=UgHMZ9Yl; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UgHMZ9Yl" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1740071217; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9A7fliY0YRXdz2jG0qB4m1rYmz3YhV2HEkf3OmiphRo=; b=UgHMZ9YlJHXwO2VLXcxQ1wqduRhkBTuM4HrTQKYCp3Q5yRtptB0ha+syBZ6vqWfR56Gfc0 r4RCYdg0gjzzOQuspCAjL6wqlBWOZsMEOZNgA/7fmUfKTip3IXR+WyH+EMZFfUU3F3Erf0 SptreVDS9F9kovfAf92OUG9WTv3wvh8= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-313-_gPyy4IXOjSlbdXdp0wBig-1; Thu, 20 Feb 2025 12:06:52 -0500 X-MC-Unique: _gPyy4IXOjSlbdXdp0wBig-1 X-Mimecast-MFC-AGG-ID: _gPyy4IXOjSlbdXdp0wBig_1740071211 Received: from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 08DB01800982; Thu, 20 Feb 2025 17:06:51 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id E9F6C180087E; Thu, 20 Feb 2025 17:06:49 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: seanjc@google.com, Yan Zhao , Rick Edgecombe , Zhiming Hu , Isaku Yamahata Subject: [PATCH 30/30] KVM: TDX: Register TDX host key IDs to cgroup misc controller Date: Thu, 20 Feb 2025 12:06:04 -0500 Message-ID: <20250220170604.2279312-31-pbonzini@redhat.com> In-Reply-To: <20250220170604.2279312-1-pbonzini@redhat.com> References: <20250220170604.2279312-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 From: Zhiming Hu TDX host key IDs (HKID) are limit resources in a machine, and the misc cgroup lets the machine owner track their usage and limits the possibility of abusing them outside the owner's control. The cgroup v2 miscellaneous subsystem was introduced to control the resource of AMD SEV & SEV-ES ASIDs. Likewise introduce HKIDs as a misc resource. Signed-off-by: Zhiming Hu Signed-off-by: Isaku Yamahata Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/tdx.h | 2 ++ arch/x86/kvm/vmx/tdx.c | 14 ++++++++++++++ arch/x86/kvm/vmx/tdx.h | 1 + arch/x86/virt/vmx/tdx/tdx.c | 6 ++++++ include/linux/misc_cgroup.h | 4 ++++ kernel/cgroup/misc.c | 4 ++++ 6 files changed, 31 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 52a21075c0a6..7dd71ca3eb57 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -124,6 +124,7 @@ const char *tdx_dump_mce_info(struct mce *m); const struct tdx_sys_info *tdx_get_sysinfo(void); int tdx_guest_keyid_alloc(void); +u32 tdx_get_nr_guest_keyids(void); void tdx_guest_keyid_free(unsigned int keyid); struct tdx_td { @@ -179,6 +180,7 @@ u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td); static inline void tdx_init(void) { } static inline int tdx_cpu_enable(void) { return -ENODEV; } static inline int tdx_enable(void) { return -ENODEV; } +static u32 tdx_get_nr_guest_keyids(void) { return 0; } static inline const char *tdx_dump_mce_info(struct mce *m) { return NULL; } static inline const struct tdx_sys_info *tdx_get_sysinfo(void) { return NULL; } #endif /* CONFIG_INTEL_TDX_HOST */ diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 1c88f577ec2b..a2355eed8f5a 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include #include +#include #include #include "capabilities.h" #include "mmu.h" @@ -143,6 +144,9 @@ static inline void tdx_hkid_free(struct kvm_tdx *kvm_tdx) tdx_guest_keyid_free(kvm_tdx->hkid); kvm_tdx->hkid = -1; atomic_dec(&nr_configured_hkid); + misc_cg_uncharge(MISC_CG_RES_TDX, kvm_tdx->misc_cg, 1); + put_misc_cg(kvm_tdx->misc_cg); + kvm_tdx->misc_cg = NULL; } static inline bool is_hkid_assigned(struct kvm_tdx *kvm_tdx) @@ -684,6 +688,10 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params, if (ret < 0) return ret; kvm_tdx->hkid = ret; + kvm_tdx->misc_cg = get_current_misc_cg(); + ret = misc_cg_try_charge(MISC_CG_RES_TDX, kvm_tdx->misc_cg, 1); + if (ret) + goto free_hkid; ret = -ENOMEM; @@ -1465,6 +1473,11 @@ static int __init __tdx_bringup(void) goto get_sysinfo_err; } + if (misc_cg_set_capacity(MISC_CG_RES_TDX, tdx_get_nr_guest_keyids())) { + r = -EINVAL; + goto get_sysinfo_err; + } + /* * Leave hardware virtualization enabled after TDX is enabled * successfully. TDX CPU hotplug depends on this. @@ -1481,6 +1494,7 @@ static int __init __tdx_bringup(void) void tdx_cleanup(void) { if (enable_tdx) { + misc_cg_set_capacity(MISC_CG_RES_TDX, 0); __do_tdx_cleanup(); kvm_disable_virtualization(); } diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index c3bde94c19dc..5fea072b4f56 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -21,6 +21,7 @@ enum kvm_tdx_state { struct kvm_tdx { struct kvm kvm; + struct misc_cg *misc_cg; int hkid; enum kvm_tdx_state state; diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 9f0c482c1a03..3a272e9ff2ca 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -1476,6 +1476,12 @@ const struct tdx_sys_info *tdx_get_sysinfo(void) } EXPORT_SYMBOL_GPL(tdx_get_sysinfo); +u32 tdx_get_nr_guest_keyids(void) +{ + return tdx_nr_guest_keyids; +} +EXPORT_SYMBOL_GPL(tdx_get_nr_guest_keyids); + int tdx_guest_keyid_alloc(void) { return ida_alloc_range(&tdx_guest_keyid_pool, tdx_guest_keyid_start, diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h index 49eef10c8e59..8c0e4f4d71be 100644 --- a/include/linux/misc_cgroup.h +++ b/include/linux/misc_cgroup.h @@ -17,6 +17,10 @@ enum misc_res_type { MISC_CG_RES_SEV, /** @MISC_CG_RES_SEV_ES: AMD SEV-ES ASIDs resource */ MISC_CG_RES_SEV_ES, +#endif +#ifdef CONFIG_INTEL_TDX_HOST + /* Intel TDX HKIDs resource */ + MISC_CG_RES_TDX, #endif /** @MISC_CG_RES_TYPES: count of enum misc_res_type constants */ MISC_CG_RES_TYPES diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c index 0e26068995a6..264aad22c967 100644 --- a/kernel/cgroup/misc.c +++ b/kernel/cgroup/misc.c @@ -24,6 +24,10 @@ static const char *const misc_res_name[] = { /* AMD SEV-ES ASIDs resource */ "sev_es", #endif +#ifdef CONFIG_INTEL_TDX_HOST + /* Intel TDX HKIDs resource */ + "tdx", +#endif }; /* Root misc cgroup */