From patchwork Wed May 25 10:06:25 2022
X-Patchwork-Submitter: Zhiquan Li
X-Patchwork-Id: 12860909
From: Zhiquan Li <zhiquan1.li@intel.com>
To: linux-sgx@vger.kernel.org, tony.luck@intel.com
Cc: jarkko@kernel.org, dave.hansen@linux.intel.com, seanjc@google.com,
    kai.huang@intel.com, fan.du@intel.com, zhiquan1.li@intel.com
Subject: [PATCH v3 1/3] x86/sgx: Repurpose the owner field as the virtual address of virtual EPC page
Date: Wed, 25 May 2022 18:06:25 +0800
Message-Id: <20220525100625.760633-1-zhiquan1.li@intel.com>
X-Mailing-List: linux-sgx@vger.kernel.org

When an EPC page triggers a machine check, only the physical address of
the EPC page is reported. However, in order to inject a #MC into the
hypervisor, the virtual address is required. Repurpose the "owner" field
as the virtual address of the virtual EPC page so that
arch_memory_failure() can easily retrieve it. The EPC page flag
SGX_EPC_PAGE_KVM_GUEST indicates how the field is to be interpreted.

Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
---
Changes since V2:
- Rework the patch as suggested by Jarkko.
- Remove struct sgx_vepc_page and relevant code.
- Remove the new EPC page flag SGX_EPC_PAGE_IS_VEPC definition as it
  duplicates SGX_EPC_PAGE_KVM_GUEST.
Link: https://lore.kernel.org/linux-sgx/eb95b32ecf3d44a695610cf7f2816785@intel.com/T/#u

Changes since V1:
- Add documentation suggested by Jarkko.
---
 arch/x86/kernel/cpu/sgx/virt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 8c7c1d0451c2..776ae5c1c032 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -46,7 +46,7 @@ static int __sgx_vepc_fault(struct sgx_vepc *vepc,
 	if (epc_page)
 		return 0;
 
-	epc_page = sgx_alloc_epc_page(vepc, false);
+	epc_page = sgx_alloc_epc_page((void *)addr, false);
 	if (IS_ERR(epc_page))
 		return PTR_ERR(epc_page);

From patchwork Wed May 25 10:07:30 2022
X-Patchwork-Submitter: Zhiquan Li
X-Patchwork-Id: 12860910
From: Zhiquan Li <zhiquan1.li@intel.com>
To: linux-sgx@vger.kernel.org, tony.luck@intel.com
Cc: jarkko@kernel.org, dave.hansen@linux.intel.com, seanjc@google.com,
    kai.huang@intel.com, fan.du@intel.com, zhiquan1.li@intel.com
Subject: [PATCH v3 2/3] x86/sgx: Fine grained SGX MCA behavior for virtualization
Date: Wed, 25 May 2022 18:07:30 +0800
Message-Id: <20220525100730.760815-1-zhiquan1.li@intel.com>
X-Mailing-List: linux-sgx@vger.kernel.org

When a VM guest accesses an SGX EPC page that has a memory failure, the
current behavior kills the whole guest, whereas only the SGX application
inside it should be killed. To fix this, send SIGBUS with code
BUS_MCEERR_AR and some extra information so that the hypervisor can
inject the #MC information into the guest, which is helpful in the SGX
case. The rest is handled on the guest side: hypervisors such as QEMU
already have mature facilities to convert an HVA to a GPA and inject a
#MC into the guest OS.

Unlike host enclaves, a virtual EPC instance cannot be shared by
multiple VMs, because how enclaves are created is entirely up to the
guest. Sharing a virtual EPC instance would very likely unexpectedly
break enclaves in all VMs. The SGX virtual EPC driver doesn't explicitly
prevent a virtual EPC instance from being shared by multiple VMs via
fork().
However, KVM doesn't support running a VM across multiple mm
structures, and the de facto userspace hypervisor (QEMU) doesn't use
fork() to create a new VM, so in practice this should not happen.

Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/linux-sgx/443cb425-009c-2784-56f4-5e707122de76@intel.com/T/#m1d1f4098f4fad78034e8706a60e4d79c119db407
---
Changes since V2:
- Retrieve the virtual address from the "owner" field of struct
  sgx_epc_page, instead of from struct sgx_vepc_page.
- Replace the EPC page flag SGX_EPC_PAGE_IS_VEPC with
  SGX_EPC_PAGE_KVM_GUEST as they are duplicates.

Changes since V1:
- Add Acked-by from Kai Huang.
- Add Kai's explanation of why we need not consider the case where one
  virtual EPC instance is shared by two guests.
---
 arch/x86/kernel/cpu/sgx/main.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index ab4ec54bbdd9..faca7f73b06d 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -715,6 +715,8 @@ int arch_memory_failure(unsigned long pfn, int flags)
 	struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT);
 	struct sgx_epc_section *section;
 	struct sgx_numa_node *node;
+	int ret = 0;
+	unsigned long vaddr;
 
 	/*
 	 * mm/memory-failure.c calls this routine for all errors
@@ -731,8 +733,26 @@ int arch_memory_failure(unsigned long pfn, int flags)
 	 * error. The signal may help the task understand why the
 	 * enclave is broken.
 	 */
-	if (flags & MF_ACTION_REQUIRED)
-		force_sig(SIGBUS);
+	if (flags & MF_ACTION_REQUIRED) {
+		/*
+		 * Provide extra info to the task so that it can make further
+		 * decision but not simply kill it. This is quite useful for
+		 * virtualization case.
+		 */
+		if (page->flags & SGX_EPC_PAGE_KVM_GUEST) {
+			/*
+			 * The "owner" field is repurposed as the virtual address
+			 * of virtual EPC page.
+			 */
+			vaddr = (unsigned long)page->owner & PAGE_MASK;
+			ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)vaddr,
+					       PAGE_SHIFT);
+			if (ret < 0)
+				pr_err("Memory failure: Error sending signal to %s:%d: %d\n",
+				       current->comm, current->pid, ret);
+		} else
+			force_sig(SIGBUS);
+	}
 
 	section = &sgx_epc_sections[page->section];
 	node = section->node;

From patchwork Wed May 25 10:07:49 2022
X-Patchwork-Submitter: Zhiquan Li
X-Patchwork-Id: 12860943
From: Zhiquan Li <zhiquan1.li@intel.com>
To: linux-sgx@vger.kernel.org, tony.luck@intel.com
Cc: jarkko@kernel.org, dave.hansen@linux.intel.com, seanjc@google.com,
    kai.huang@intel.com, fan.du@intel.com, zhiquan1.li@intel.com
Subject: [PATCH v3 3/3] x86/sgx: Fine grained SGX MCA behavior for normal case
Date: Wed, 25 May 2022 18:07:49 +0800
Message-Id: <20220525100749.760864-1-zhiquan1.li@intel.com>
X-Mailing-List: linux-sgx@vger.kernel.org

When an application accesses an SGX EPC page that has a memory failure,
the task receives a SIGBUS signal without any extra information, unless
the EPC page has the SGX_EPC_PAGE_KVM_GUEST flag. However, in some
cases SGX is only used in a sub-task, and the entire task group should
not be killed just because an SGX EPC page used by a sub-task has a
memory failure. To fix this, extend the solution to the normal case:
a regular SGX EPC page with a memory failure triggers a SIGBUS signal
with code BUS_MCEERR_AR and additional information, so that the user
has the opportunity to make a further decision.

If an enclave is shared by multiple processes and an enclave page
triggers a machine check, the enclave is disabled so that it cannot be
entered again. Killing the other processes that have the same enclave
mapped would perhaps be overkill, but they will find that the enclave
is "dead" the next time they try to use it. Thanks to Jarkko for the
heads-up and to Tony for the clarification on this point.
Our intention is to provide additional information so that the
application has more choices. The current behavior is already gentle,
and we don't want to change it.

Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
---
Changes since V2:
- Adapt the code since struct sgx_vepc_page was discarded.
- Replace the EPC page flag SGX_EPC_PAGE_IS_VEPC with
  SGX_EPC_PAGE_KVM_GUEST as they are duplicates.

Changes since V1:
- Add valuable information from Jarkko and Tony into the commit message.
---
 arch/x86/kernel/cpu/sgx/main.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index faca7f73b06d..69a2a29c8957 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -739,12 +739,15 @@ int arch_memory_failure(unsigned long pfn, int flags)
 		 * decision but not simply kill it. This is quite useful for
 		 * virtualization case.
 		 */
-		if (page->flags & SGX_EPC_PAGE_KVM_GUEST) {
+		if (page->owner) {
 			/*
 			 * The "owner" field is repurposed as the virtual address
 			 * of virtual EPC page.
 			 */
-			vaddr = (unsigned long)page->owner & PAGE_MASK;
+			if (page->flags & SGX_EPC_PAGE_KVM_GUEST)
+				vaddr = (unsigned long)page->owner & PAGE_MASK;
+			else
+				vaddr = (unsigned long)page->owner->desc & PAGE_MASK;
 			ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)vaddr,
 					       PAGE_SHIFT);
 			if (ret < 0)