diff mbox series

[v5,3/3] x86/sgx: Fine grained SGX MCA behavior for normal case

Message ID 20220622093705.2891642-4-zhiquan1.li@intel.com (mailing list archive)
State New, archived
Headers show
Series x86/sgx: fine grained SGX MCA behavior | expand

Commit Message

Zhiquan Li June 22, 2022, 9:37 a.m. UTC
When the application accesses a SGX EPC page with memory failure, the
task will receive a SIGBUS signal without any extra info, unless the
EPC page has SGX_EPC_PAGE_KVM_GUEST flag. However, in some cases,
we only use SGX in sub-task and we don't expect the entire task group
be killed due to a SGX EPC page for a sub-task has memory failure.

To fix it, we extend the solution for normal case. That is, the SGX
regular EPC page with memory failure will trigger a SIGBUS signal with
code BUS_MCEERR_AR and additional info, so that the user has opportunity
to make further decision.

Suppose an enclave is shared by multiple processes, when an enclave page
triggers a machine check, the enclave will be disabled so that it
couldn't be entered again. Killing other processes with the same enclave
mapped would perhaps be overkill, but they are going to find that the
enclave is "dead" next time they try to use it. Thanks for Jarkko's head
up and Tony's clarification on this point.

Our intension is to provide additional info so that the application has
more choices. Current behavior looks gently, and we don't want to change
it.

Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
---
No changes since V4.

Changes since V2:
- Adapted the code since struct sgx_vepc_page was discarded.
- Replace EPC page flag SGX_EPC_PAGE_IS_VEPC with
  SGX_EPC_PAGE_KVM_GUEST as they are duplicated.

Changes since V1:
- Add valuable information from Jarkko and Tony the into commit
  message.
---
 arch/x86/kernel/cpu/sgx/main.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Dave Hansen July 21, 2022, 4:57 p.m. UTC | #1
On 6/22/22 02:37, Zhiquan Li wrote:
> When the application accesses a SGX EPC page with memory failure, the
> task will receive a SIGBUS signal without any extra info, unless the
> EPC page has SGX_EPC_PAGE_KVM_GUEST flag. However, in some cases,
> we only use SGX in sub-task and we don't expect the entire task group
> be killed due to a SGX EPC page for a sub-task has memory failure.
> 
> To fix it, we extend the solution for normal case. That is, the SGX
> regular EPC page with memory failure will trigger a SIGBUS signal with
> code BUS_MCEERR_AR and additional info, so that the user has opportunity
> to make further decision.
> 
> Suppose an enclave is shared by multiple processes, when an enclave page
> triggers a machine check, the enclave will be disabled so that it
> couldn't be entered again. Killing other processes with the same enclave
> mapped would perhaps be overkill, but they are going to find that the
> enclave is "dead" next time they try to use it. Thanks for Jarkko's head
> up and Tony's clarification on this point.
> 
> Our intension is to provide additional info so that the application has
> more choices. Current behavior looks gently, and we don't want to change
> it.
> 
> Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>

I honestly have zero idea what this patch is doing.
Zhiquan Li July 22, 2022, 5:28 p.m. UTC | #2
On 2022/7/22 00:57, Dave Hansen wrote:
> I honestly have zero idea what this patch is doing.

OK, let's drop it unless we have more proper reason in future.

Best Regards,
Zhiquan
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 4507c2302348..7c55dcdb2b7c 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -739,12 +739,15 @@  int arch_memory_failure(unsigned long pfn, int flags)
 		 * decision but not simply kill it. This is quite useful for
 		 * virtualization case.
 		 */
-		if (page->flags & SGX_EPC_PAGE_KVM_GUEST) {
+		if (page->owner) {
 			/*
 			 * The "owner" field is repurposed as the virtual address
 			 * of virtual EPC page.
 			 */
-			vaddr = (unsigned long)page->owner & PAGE_MASK;
+			if (page->flags & SGX_EPC_PAGE_KVM_GUEST)
+				vaddr = (unsigned long)page->owner & PAGE_MASK;
+			else
+				vaddr = (unsigned long)page->owner->desc & PAGE_MASK;
 			ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)vaddr,
 					PAGE_SHIFT);
 			if (ret < 0)