From patchwork Mon Feb 13 11:59:18 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Huang, Kai"
X-Patchwork-Id: 13138306
From: Kai Huang <kai.huang@intel.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: linux-mm@kvack.org, dave.hansen@intel.com, peterz@infradead.org,
    tglx@linutronix.de, seanjc@google.com, pbonzini@redhat.com,
    dan.j.williams@intel.com, rafael.j.wysocki@intel.com,
    kirill.shutemov@linux.intel.com, ying.huang@intel.com,
    reinette.chatre@intel.com, len.brown@intel.com, tony.luck@intel.com,
    ak@linux.intel.com, isaku.yamahata@intel.com, chao.gao@intel.com,
    sathyanarayanan.kuppuswamy@linux.intel.com, david@redhat.com,
    bagasdotme@gmail.com, sagis@google.com, imammedo@redhat.com,
    kai.huang@intel.com
Subject: [PATCH v9 11/18] x86/virt/tdx: Fill out TDMRs to cover all TDX memory regions
Date: Tue, 14 Feb 2023 00:59:18 +1300
Message-Id: <2de22607f9e00d6b9beb3ca9922c30911650c2c1.1676286526.git.kai.huang@intel.com>
X-Mailer: git-send-email 2.39.1
In-Reply-To: 
References: 

Start to work through the "multi-steps" of constructing a list of
"TD Memory Regions" (TDMRs) to cover all TDX-usable memory regions.

The kernel configures TDX-usable memory regions by passing a list of
"TD Memory Regions" (TDMRs) to the TDX module.  Each TDMR contains the
information of the base/size of a memory region, the base/size of the
associated Physical Address Metadata Table (PAMT), and a list of
reserved areas in the region.

Do the first step to fill out a number of TDMRs to cover all TDX
memory regions.
To keep it simple, always try to use one TDMR for each memory region.
As the first step, only set up the base/size for each TDMR.

Each TDMR must be 1G aligned and its size must be in 1G granularity.
This implies that one TDMR could cover multiple memory regions.  If a
memory region spans the 1GB boundary and the former part is already
covered by the previous TDMR, just use a new TDMR for the remaining
part.

TDX only supports a limited number of TDMRs.  Disable TDX if all TDMRs
are consumed but there are more memory regions to cover.

There are fancier things that could be done, like trying to merge
adjacent TDMRs.  This would allow more pathological memory layouts to
be supported.  But current systems are not even close to exhausting the
existing TDMR resources in practice.  For now, keep it simple.

Signed-off-by: Kai Huang <kai.huang@intel.com>
---

v8 -> v9:
 - Added the last paragraph in the changelog (Dave).
 - Removed unnecessary type cast in tdmr_entry() (Dave).

---
 arch/x86/virt/vmx/tdx/tdx.c | 94 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index f604e3399d03..5ff346871b4b 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -480,6 +480,93 @@ static void free_tdmr_list(struct tdmr_info_list *tdmr_list)
 			tdmr_list->max_tdmrs * tdmr_list->tdmr_sz);
 }
 
+/* Get the TDMR from the list at the given index. */
+static struct tdmr_info *tdmr_entry(struct tdmr_info_list *tdmr_list,
+				    int idx)
+{
+	int tdmr_info_offset = tdmr_list->tdmr_sz * idx;
+
+	return (void *)tdmr_list->tdmrs + tdmr_info_offset;
+}
+
+#define TDMR_ALIGNMENT		BIT_ULL(30)
+#define TDMR_PFN_ALIGNMENT	(TDMR_ALIGNMENT >> PAGE_SHIFT)
+#define TDMR_ALIGN_DOWN(_addr)	ALIGN_DOWN((_addr), TDMR_ALIGNMENT)
+#define TDMR_ALIGN_UP(_addr)	ALIGN((_addr), TDMR_ALIGNMENT)
+
+static inline u64 tdmr_end(struct tdmr_info *tdmr)
+{
+	return tdmr->base + tdmr->size;
+}
+
+/*
+ * Take the memory referenced in @tmb_list and populate the
+ * preallocated @tdmr_list, following all the special alignment
+ * and size rules for TDMR.
+ */
+static int fill_out_tdmrs(struct list_head *tmb_list,
+			  struct tdmr_info_list *tdmr_list)
+{
+	struct tdx_memblock *tmb;
+	int tdmr_idx = 0;
+
+	/*
+	 * Loop over TDX memory regions and fill out TDMRs to cover them.
+	 * To keep it simple, always try to use one TDMR to cover one
+	 * memory region.
+	 *
+	 * In practice TDX1.0 supports 64 TDMRs, which is big enough to
+	 * cover all memory regions in reality if the admin doesn't use
+	 * 'memmap' to create a bunch of discrete memory regions.  When
+	 * there's a real problem, enhancement can be done to merge TDMRs
+	 * to reduce the final number of TDMRs.
+	 */
+	list_for_each_entry(tmb, tmb_list, list) {
+		struct tdmr_info *tdmr = tdmr_entry(tdmr_list, tdmr_idx);
+		u64 start, end;
+
+		start = TDMR_ALIGN_DOWN(PFN_PHYS(tmb->start_pfn));
+		end = TDMR_ALIGN_UP(PFN_PHYS(tmb->end_pfn));
+
+		/*
+		 * A valid size indicates the current TDMR has already
+		 * been filled out to cover the previous memory region(s).
+		 */
+		if (tdmr->size) {
+			/*
+			 * Loop to the next if the current memory region
+			 * has already been fully covered.
+			 */
+			if (end <= tdmr_end(tdmr))
+				continue;
+
+			/* Otherwise, skip the already covered part. */
+			if (start < tdmr_end(tdmr))
+				start = tdmr_end(tdmr);
+
+			/*
+			 * Create a new TDMR to cover the current memory
+			 * region, or the remaining part of it.
+			 */
+			tdmr_idx++;
+			if (tdmr_idx >= tdmr_list->max_tdmrs) {
+				pr_warn("initialization failed: TDMRs exhausted.\n");
+				return -ENOSPC;
+			}
+
+			tdmr = tdmr_entry(tdmr_list, tdmr_idx);
+		}
+
+		tdmr->base = start;
+		tdmr->size = end - start;
+	}
+
+	/* @tdmr_idx is always the index of last valid TDMR. */
+	tdmr_list->nr_consumed_tdmrs = tdmr_idx + 1;
+
+	return 0;
+}
+
 /*
  * Construct a list of TDMRs on the preallocated space in @tdmr_list
  * to cover all TDX memory regions in @tmb_list based on the TDX module
@@ -489,10 +576,15 @@ static int construct_tdmrs(struct list_head *tmb_list,
 			   struct tdmr_info_list *tdmr_list,
 			   struct tdsysinfo_struct *sysinfo)
 {
+	int ret;
+
+	ret = fill_out_tdmrs(tmb_list, tdmr_list);
+	if (ret)
+		return ret;
+
 	/*
 	 * TODO:
 	 *
-	 * - Fill out TDMRs to cover all TDX memory regions.
 	 * - Allocate and set up PAMTs for each TDMR.
 	 * - Designate reserved areas for each TDMR.
 	 *
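
[Editor's note: the following is an illustrative aside, not part of the patch.]
To see the 1G alignment and "new TDMR for the remaining part" rules in
action, here is a minimal user-space sketch that mirrors the
fill_out_tdmrs() logic above.  The region addresses, the demo_tdmr type,
and the simplified ALIGN macros are made up for illustration; the
max_tdmrs exhaustion check is omitted for brevity.

	/* tdmr_fill_demo.c -- illustrative only, not part of the patch. */
	#include <stdio.h>
	#include <stdint.h>

	#define TDMR_ALIGNMENT		(1ULL << 30)	/* 1G, as in the patch */
	#define ALIGN_DOWN(x, a)	((x) & ~((a) - 1))
	#define ALIGN_UP(x, a)		ALIGN_DOWN((x) + (a) - 1, (a))

	struct demo_tdmr { uint64_t base, size; };

	static uint64_t demo_tdmr_end(struct demo_tdmr *t)
	{
		return t->base + t->size;
	}

	int main(void)
	{
		/*
		 * Two hypothetical memory regions.  The second one spans
		 * the 1G boundary, and its former part is already covered
		 * by the TDMR created for the first region.
		 */
		struct { uint64_t start, end; } regions[] = {
			{ 0x00100000ULL, 0x3fe00000ULL },	/*   1M .. ~1022M */
			{ 0x3ff00000ULL, 0x80000000ULL },	/* ~1023M ..   2G */
		};
		struct demo_tdmr tdmrs[64] = { { 0, 0 } };	/* TDX1.0: 64 TDMRs */
		int idx = 0;

		for (int i = 0; i < 2; i++) {
			uint64_t start = ALIGN_DOWN(regions[i].start, TDMR_ALIGNMENT);
			uint64_t end = ALIGN_UP(regions[i].end, TDMR_ALIGNMENT);

			if (tdmrs[idx].size) {
				/* Region already fully covered: nothing to do. */
				if (end <= demo_tdmr_end(&tdmrs[idx]))
					continue;
				/* Skip the part the previous TDMR already covers. */
				if (start < demo_tdmr_end(&tdmrs[idx]))
					start = demo_tdmr_end(&tdmrs[idx]);
				/* Open a new TDMR for the remaining part. */
				idx++;
			}
			tdmrs[idx].base = start;
			tdmrs[idx].size = end - start;
		}

		for (int i = 0; i <= idx; i++)
			printf("TDMR[%d]: base=0x%llx size=0x%llx\n", i,
			       (unsigned long long)tdmrs[i].base,
			       (unsigned long long)tdmrs[i].size);
		return 0;
	}

Compiled and run, this prints two 1G TDMRs covering [0, 2G): the first
region aligns out to [0, 1G), and the second, although it starts below
1G, gets a fresh TDMR for [1G, 2G) because its front part is already
covered.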