From patchwork Thu Jul 11 14:25:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040181 From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 01/26] mm/x86: Introduce kernel address space isolation Date: Thu, 11 Jul 2019 16:25:13 +0200 Message-Id: <1562855138-19507-2-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> Sender:
kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Introduce core functions and structures for implementing Address Space Isolation (ASI). Kernel address space isolation provides the ability to run some kernel code with a reduced kernel address space. An isolated address space is described by a struct asi, which has its own page-table. While, for now, this page-table is empty, it will eventually be possible to populate it with far fewer mappings than the full kernel page-table. Isolation is entered by calling asi_enter(), which switches from the kernel page-table to the ASI page-table. Isolation is then exited by calling asi_exit(), which switches the page-table back to the kernel page-table. Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 41 ++++++++++++ arch/x86/mm/Makefile | 2 + arch/x86/mm/asi.c | 152 ++++++++++++++++++++++++++++++++++++++++++++ security/Kconfig | 10 +++ 4 files changed, 205 insertions(+), 0 deletions(-) create mode 100644 arch/x86/include/asm/asi.h create mode 100644 arch/x86/mm/asi.c diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h new file mode 100644 index 0000000..8a13f73 --- /dev/null +++ b/arch/x86/include/asm/asi.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef ARCH_X86_MM_ASI_H +#define ARCH_X86_MM_ASI_H + +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + +#include +#include + +struct asi { + spinlock_t lock; /* protect all attributes */ + pgd_t *pgd; /* ASI page-table */ +}; + +/* + * An ASI session maintains the state of address space isolation on a + * cpu. There is one ASI session per cpu. There is no lock to protect + * members of the asi_session structure as each cpu is managing its + * own ASI session.
+ */ + +enum asi_session_state { + ASI_SESSION_STATE_INACTIVE, /* no address space isolation */ + ASI_SESSION_STATE_ACTIVE, /* address space isolation is active */ +}; + +struct asi_session { + struct asi *asi; /* ASI for this session */ + enum asi_session_state state; /* state of ASI session */ + unsigned long original_cr3; /* cr3 before entering ASI */ + struct task_struct *task; /* task during isolation */ +} __aligned(PAGE_SIZE); + +extern struct asi *asi_create(void); +extern void asi_destroy(struct asi *asi); +extern int asi_enter(struct asi *asi); +extern void asi_exit(struct asi *asi); + +#endif /* CONFIG_ADDRESS_SPACE_ISOLATION */ + +#endif diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 84373dc..dae5c8a 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -49,7 +49,9 @@ obj-$(CONFIG_X86_INTEL_MPX) += mpx.o obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o obj-$(CONFIG_PAGE_TABLE_ISOLATION) += pti.o +obj-$(CONFIG_ADDRESS_SPACE_ISOLATION) += asi.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o + diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c new file mode 100644 index 0000000..c3993b7 --- /dev/null +++ b/arch/x86/mm/asi.c @@ -0,0 +1,152 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2019, Oracle and/or its affiliates. All rights reserved. + * + * Kernel Address Space Isolation (ASI) + */ + +#include +#include +#include +#include +#include + +#include +#include +#include + +/* ASI sessions, one per cpu */ +DEFINE_PER_CPU_PAGE_ALIGNED(struct asi_session, cpu_asi_session); + +static int asi_init_mapping(struct asi *asi) +{ + /* + * TODO: Populate the ASI page-table with minimal mappings so + * that we can at least enter isolation and abort. 
+ */ + return 0; +} + +struct asi *asi_create(void) +{ + struct page *page; + struct asi *asi; + int err; + + asi = kzalloc(sizeof(*asi), GFP_KERNEL); + if (!asi) + return NULL; + + page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO); + if (!page) + goto error; + + asi->pgd = page_address(page); + spin_lock_init(&asi->lock); + + err = asi_init_mapping(asi); + if (err) + goto error; + + return asi; + +error: + asi_destroy(asi); + return NULL; +} +EXPORT_SYMBOL(asi_create); + +void asi_destroy(struct asi *asi) +{ + if (!asi) + return; + + if (asi->pgd) + free_page((unsigned long)asi->pgd); + + kfree(asi); +} +EXPORT_SYMBOL(asi_destroy); + + +/* + * When isolation is active, the address space doesn't necessarily map + * the percpu offset value (this_cpu_off) which is used to get pointers + * to percpu variables. So functions which can be invoked while isolation + * is active shouldn't be getting pointers to percpu variables (i.e. with + * get_cpu_var() or this_cpu_ptr()). Instead percpu variable should be + * directly read or written to (i.e. with this_cpu_read() or + * this_cpu_write()). + */ + +int asi_enter(struct asi *asi) +{ + enum asi_session_state state; + struct asi *current_asi; + struct asi_session *asi_session; + + state = this_cpu_read(cpu_asi_session.state); + /* + * We can re-enter isolation, but only with the same ASI (we don't + * support nesting isolation). Also, if isolation is still active, + * then we should be re-entering with the same task. 
+ */ + if (state == ASI_SESSION_STATE_ACTIVE) { + current_asi = this_cpu_read(cpu_asi_session.asi); + if (current_asi != asi) { + WARN_ON(1); + return -EBUSY; + } + WARN_ON(this_cpu_read(cpu_asi_session.task) != current); + return 0; + } + + /* isolation is not active so we can safely access the percpu pointer */ + asi_session = &get_cpu_var(cpu_asi_session); + asi_session->asi = asi; + asi_session->task = current; + asi_session->original_cr3 = __get_current_cr3_fast(); + if (!asi_session->original_cr3) { + WARN_ON(1); + err = -EINVAL; + goto err_clear_asi; + } + asi_session->state = ASI_SESSION_STATE_ACTIVE; + + load_cr3(asi->pgd); + + return 0; + +err_clear_asi: + asi_session->asi = NULL; + asi_session->task = NULL; + + return err; + +} +EXPORT_SYMBOL(asi_enter); + +void asi_exit(struct asi *asi) +{ + struct asi_session *asi_session; + enum asi_session_state asi_state; + unsigned long original_cr3; + + asi_state = this_cpu_read(cpu_asi_session.state); + if (asi_state == ASI_SESSION_STATE_INACTIVE) + return; + + /* TODO: Kick sibling hyperthread before switching to kernel cr3 */ + original_cr3 = this_cpu_read(cpu_asi_session.original_cr3); + if (original_cr3) + write_cr3(original_cr3); + + /* page-table was switched, we can now access the percpu pointer */ + asi_session = &get_cpu_var(cpu_asi_session); + WARN_ON(asi_session->task != current); + asi_session->state = ASI_SESSION_STATE_INACTIVE; + asi_session->asi = NULL; + asi_session->task = NULL; + asi_session->original_cr3 = 0; +} +EXPORT_SYMBOL(asi_exit); diff --git a/security/Kconfig b/security/Kconfig index 466cc1f..241b9a7 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -65,6 +65,16 @@ config PAGE_TABLE_ISOLATION See Documentation/x86/pti.txt for more details. 
+config ADDRESS_SPACE_ISOLATION + bool "Allow code to run with a reduced kernel address space" + default y + depends on (X86_64 || X86_PAE) && !UML + help + This feature provides the ability to run some kernel code + with a reduced kernel address space. This can be used to + mitigate speculative execution attacks which are able to + leak data between sibling CPU hyper-threads. + config SECURITY_INFINIBAND bool "Infiniband Security Hooks" depends on SECURITY && INFINIBAND From patchwork Thu Jul 11 14:25:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040169 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B911B1395 for ; Thu, 11 Jul 2019 14:29:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A838228ABE for ; Thu, 11 Jul 2019 14:29:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9C03828AD1; Thu, 11 Jul 2019 14:29:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 93AFD28AC8 for ; Thu, 11 Jul 2019 14:29:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728884AbfGKO2L (ORCPT ); Thu, 11 Jul 2019 10:28:11 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:39866 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728835AbfGKO2K (ORCPT ); Thu, 11 Jul 2019 10:28:10 -0400 Received: 
from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOGvV100511; Thu, 11 Jul 2019 14:25:58 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 02/26] mm/asi: Abort isolation on interrupt, exception and context switch Date: Thu, 11 Jul 2019 16:25:14 +0200 Message-Id: <1562855138-19507-3-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References:
<1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Address space isolation should be aborted if there is an interrupt, an exception or a context switch. Interrupt/exception handlers and context switch code need to run with the full kernel address space. Address space isolation is aborted by restoring the original CR3 value used before entering address space isolation. Signed-off-by: Alexandre Chartre --- arch/x86/entry/entry_64.S | 42 ++++++++++- arch/x86/include/asm/asi.h | 114 ++++++++++++++++++++++++++++ arch/x86/kernel/asm-offsets.c | 4 + arch/x86/mm/asi.c | 165 ++++++++++++++++++++++++++++++++++++++--- kernel/sched/core.c | 4 + 5 files changed, 315 insertions(+), 14 deletions(-) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 11aa3b2..3dc6174 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -38,6 +38,7 @@ #include #include #include +#include #include #include "calling.h" @@ -558,8 +559,15 @@ ENTRY(interrupt_entry) TRACE_IRQS_OFF CALL_enter_from_user_mode - +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + jmp 2f +#endif 1: +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + /* Abort address space isolation if it is active */ + ASI_START_ABORT +2: +#endif ENTER_IRQ_STACK old_rsp=%rdi save_ret=1 /* We entered an interrupt context - irqs are off: */ TRACE_IRQS_OFF @@ -583,6 +591,9 @@ common_interrupt: call do_IRQ /* rdi points to pt_regs */ /* 0(%rsp): old RSP */ 
ret_from_intr: +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + ASI_FINISH_ABORT +#endif DISABLE_INTERRUPTS(CLBR_ANY) TRACE_IRQS_OFF @@ -947,6 +958,9 @@ ENTRY(\sym) addq $\ist_offset, CPU_TSS_IST(\shift_ist) .endif +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + ASI_FINISH_ABORT +#endif /* these procedures expect "no swapgs" flag in ebx */ .if \paranoid jmp paranoid_exit @@ -1182,6 +1196,16 @@ ENTRY(paranoid_entry) xorl %ebx, %ebx 1: +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + /* + * If address space isolation is active then abort it and return + * the original kernel CR3 in %r14. + */ + ASI_START_ABORT_ELSE_JUMP 2f + movq %rdi, %r14 + ret +2: +#endif /* * Always stash CR3 in %r14. This value will be restored, * verbatim, at exit. Needed if paranoid_entry interrupted @@ -1265,6 +1289,15 @@ ENTRY(error_entry) CALL_enter_from_user_mode ret +.Lerror_entry_check_address_space_isolation: +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + /* + * Abort address space isolation if it is active. This will restore + * the original kernel CR3. + */ + ASI_START_ABORT +#endif + .Lerror_entry_done: TRACE_IRQS_OFF ret @@ -1283,7 +1316,7 @@ ENTRY(error_entry) cmpq %rax, RIP+8(%rsp) je .Lbstep_iret cmpq $.Lgs_change, RIP+8(%rsp) - jne .Lerror_entry_done + jne .Lerror_entry_check_address_space_isolation /* * hack: .Lgs_change can fail with user gsbase. 
If this happens, fix up @@ -1632,7 +1665,10 @@ end_repeat_nmi: movq %rsp, %rdi movq $-1, %rsi call do_nmi - + +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + ASI_FINISH_ABORT +#endif /* Always restore stashed CR3 value (see paranoid_entry) */ RESTORE_CR3 scratch_reg=%r15 save_reg=%r14 diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 8a13f73..ff126e1 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -4,6 +4,8 @@ #ifdef CONFIG_ADDRESS_SPACE_ISOLATION +#ifndef __ASSEMBLY__ + #include #include @@ -22,20 +24,132 @@ struct asi { enum asi_session_state { ASI_SESSION_STATE_INACTIVE, /* no address space isolation */ ASI_SESSION_STATE_ACTIVE, /* address space isolation is active */ + ASI_SESSION_STATE_ABORTED, /* isolation has been aborted */ }; struct asi_session { struct asi *asi; /* ASI for this session */ enum asi_session_state state; /* state of ASI session */ + bool retry_abort; /* always retry abort */ + unsigned int abort_depth; /* abort depth */ unsigned long original_cr3; /* cr3 before entering ASI */ struct task_struct *task; /* task during isolation */ } __aligned(PAGE_SIZE); +DECLARE_PER_CPU_PAGE_ALIGNED(struct asi_session, cpu_asi_session); + extern struct asi *asi_create(void); extern void asi_destroy(struct asi *asi); extern int asi_enter(struct asi *asi); extern void asi_exit(struct asi *asi); +/* + * Function to exit the current isolation. This is used to abort isolation + * when a task using isolation is scheduled out. + */ +static inline void asi_abort(void) +{ + enum asi_session_state asi_state; + + asi_state = this_cpu_read(cpu_asi_session.state); + if (asi_state == ASI_SESSION_STATE_INACTIVE) + return; + + asi_exit(this_cpu_read(cpu_asi_session.asi)); +} + +/* + * Barriers for code which sets CR3 to use the ASI page-table. That's + * the case, for example, when entering isolation, or during a VMExit if + * isolation was active. 
If such code is interrupted before CR3 is + * actually set, then the interrupt will abort isolation and restore + * the original CR3 value. But then, the code will set CR3 to use the + * ASI page-table while isolation has been aborted by the interrupt. + * + * To prevent this issue, such code should call asi_barrier_begin() + * before CR3 gets updated, and asi_barrier_end() after CR3 has been + * updated. + * + * asi_barrier_begin() will set retry_abort to true. This will force + * interrupts to retain the isolation abort state. Then, after the code + * has updated CR3, asi_barrier_end() will be able to check if isolation + * was aborted and actually abort isolation in that case. Setting + * retry_abort to true will also force all interrupts to restore the + * original CR3; that's in case we have interrupts both before and + * after CR3 is set. + */ +static inline unsigned long asi_restore_cr3(void) +{ + unsigned long original_cr3; + + /* TODO: Kick sibling hyperthread before switching to kernel cr3 */ + original_cr3 = this_cpu_read(cpu_asi_session.original_cr3); + if (original_cr3) + write_cr3(original_cr3); + + return original_cr3; +} + +static inline void asi_barrier_begin(void) +{ + this_cpu_write(cpu_asi_session.retry_abort, true); + mb(); +} + +static inline void asi_barrier_end(void) +{ + enum asi_session_state state; + + this_cpu_write(cpu_asi_session.retry_abort, false); + mb(); + state = this_cpu_read(cpu_asi_session.state); + if (state == ASI_SESSION_STATE_ABORTED) { + (void) asi_restore_cr3(); + asi_abort(); + return; + } + +} + +#else /* __ASSEMBLY__ */ + +/* + * If address space isolation is active, start aborting isolation. + */ +.macro ASI_START_ABORT + movl PER_CPU_VAR(cpu_asi_session + CPU_ASI_SESSION_state), %edi + testl %edi, %edi + jz .Lasi_start_abort_done_\@ + call asi_start_abort +.Lasi_start_abort_done_\@: +.endm + +/* + * If address space isolation is active, finish aborting isolation.
+ */ +.macro ASI_FINISH_ABORT + movl PER_CPU_VAR(cpu_asi_session + CPU_ASI_SESSION_state), %edi + testl %edi, %edi + jz .Lasi_finish_abort_done_\@ + call asi_finish_abort +.Lasi_finish_abort_done_\@: +.endm + +/* + * If address space isolation is inactive then jump to the specified + * label. Otherwise, start aborting isolation. + */ +.macro ASI_START_ABORT_ELSE_JUMP asi_inactive_label:req + movl PER_CPU_VAR(cpu_asi_session + CPU_ASI_SESSION_state), %edi + testl %edi, %edi + jz \asi_inactive_label + call asi_start_abort + testq %rdi, %rdi + jz \asi_inactive_label +.endm + +#endif /* __ASSEMBLY__ */ + #endif /* CONFIG_ADDRESS_SPACE_ISOLATION */ #endif diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c index 168543d..395d0c6 100644 --- a/arch/x86/kernel/asm-offsets.c +++ b/arch/x86/kernel/asm-offsets.c @@ -18,6 +18,7 @@ #include #include #include +#include #ifdef CONFIG_XEN #include @@ -105,4 +106,7 @@ static void __used common(void) OFFSET(TSS_sp0, tss_struct, x86_tss.sp0); OFFSET(TSS_sp1, tss_struct, x86_tss.sp1); OFFSET(TSS_sp2, tss_struct, x86_tss.sp2); + + BLANK(); + OFFSET(CPU_ASI_SESSION_state, asi_session, state); } diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index c3993b7..fabb923 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -84,9 +84,17 @@ int asi_enter(struct asi *asi) enum asi_session_state state; struct asi *current_asi; struct asi_session *asi_session; + unsigned long original_cr3; state = this_cpu_read(cpu_asi_session.state); /* + * The "aborted" state is a transient state used in interrupt and + * exception handlers while aborting isolation. So it shouldn't be + * set when entering isolation. + */ + WARN_ON(state == ASI_SESSION_STATE_ABORTED); + + /* * We can re-enter isolation, but only with the same ASI (we don't * support nesting isolation). Also, if isolation is still active, * then we should be re-entering with the same task. 
@@ -105,15 +113,44 @@ int asi_enter(struct asi *asi) asi_session = &get_cpu_var(cpu_asi_session); asi_session->asi = asi; asi_session->task = current; - asi_session->original_cr3 = __get_current_cr3_fast(); - if (!asi_session->original_cr3) { + WARN_ON(asi_session->abort_depth > 0); + + /* + * Instructions ordering is important here because we should be + * able to deal with any interrupt/exception which will abort + * the isolation and restore CR3 to its original value: + * + * - asi_session->original_cr3 must be set before the ASI session + * becomes active (i.e. before setting asi_session->state to + * ASI_SESSION_STATE_ACTIVE); + * - the ASI session must be marked as active (i.e. set + * asi_session->state to ASI_SESSION_STATE_ACTIVE) before + * loading the CR3 used during isolation. + * + * Any exception or interrupt occurring after asi_session->state is + * set to ASI_SESSION_STATE_ACTIVE will cause the exception/interrupt + * handler to abort the isolation. The handler will then restore + * cr3 to asi_session->original_cr3 and move asi_session->state to + * ASI_SESSION_STATE_ABORTED. + */ + original_cr3 = __get_current_cr3_fast(); + if (!original_cr3) { WARN_ON(1); err = -EINVAL; goto err_clear_asi; } - asi_session->state = ASI_SESSION_STATE_ACTIVE; + asi_session->original_cr3 = original_cr3; + /* + * Use ASI barrier as we are setting CR3 with the ASI page-table. + * The barrier should begin before setting the state to active as + * any interrupt after the state is active will abort isolation. 
+ */ + asi_barrier_begin(); + asi_session->state = ASI_SESSION_STATE_ACTIVE; + mb(); load_cr3(asi->pgd); + asi_barrier_end(); return 0; @@ -130,23 +167,129 @@ void asi_exit(struct asi *asi) { struct asi_session *asi_session; enum asi_session_state asi_state; - unsigned long original_cr3; asi_state = this_cpu_read(cpu_asi_session.state); - if (asi_state == ASI_SESSION_STATE_INACTIVE) + switch (asi_state) { + case ASI_SESSION_STATE_INACTIVE: return; - - /* TODO: Kick sibling hyperthread before switching to kernel cr3 */ - original_cr3 = this_cpu_read(cpu_asi_session.original_cr3); - if (original_cr3) - write_cr3(original_cr3); + case ASI_SESSION_STATE_ACTIVE: + (void) asi_restore_cr3(); + break; + case ASI_SESSION_STATE_ABORTED: + /* + * No need to restore cr3, this was already done during + * the isolation abort. + */ + break; + } /* page-table was switched, we can now access the percpu pointer */ asi_session = &get_cpu_var(cpu_asi_session); - WARN_ON(asi_session->task != current); + /* + * asi_exit() can be interrupted before setting the state to + * ASI_SESSION_STATE_INACTIVE. In that case, the interrupt will + * exit isolation before we have started the actual exit. So + * check that the session ASI is still set to verify that an + * exit hasn't already been done. + */ asi_session->state = ASI_SESSION_STATE_INACTIVE; + mb(); + if (asi_session->asi == NULL) { + /* exit was already done */ + return; + } + WARN_ON(asi_session->retry_abort); + WARN_ON(asi_session->task != current); asi_session->asi = NULL; asi_session->task = NULL; asi_session->original_cr3 = 0; + + /* + * Reset abort_depth because some interrupt/exception handlers + * (like the user page-fault handler) can schedule us out and so + * exit isolation before abort_depth reaches 0. + */ + asi_session->abort_depth = 0; } EXPORT_SYMBOL(asi_exit); + +/* + * Functions to abort isolation. When address space isolation is active, + * these functions are used by interrupt/exception handlers to abort + * isolation.
+ * + * Common Case + * ----------- + * asi_start_abort() is invoked at the beginning of the interrupt/exception + * handler. It aborts isolation by restoring the original CR3 value, + * increments the abort count, and moves the isolation state to "aborted" + * (ASI_SESSION_STATE_ABORTED). If the interrupt/exception is interrupted + * by another interrupt/exception then the new interrupt/exception will + * just increment the abort count. + * + * asi_finish_abort() is invoked at the end of the interrupt/exception + * handler. It decrements the abort count and if that count reaches zero + * then it invokes asi_exit() to exit isolation. + * + * Special Case When Entering Isolation + * ------------------------------------ + * When entering isolation, asi_enter() will set cpu_asi_session.retry_abort + * while updating CR3 to the ASI page-table. This forces asi_start_abort() + * handlers to abort isolation even if isolation was already aborted. Also + * asi_finish_abort() will retain the aborted state and not exit isolation + * (no call to asi_exit()). + */ +unsigned long asi_start_abort(void) +{ + enum asi_session_state state; + unsigned long original_cr3; + + state = this_cpu_read(cpu_asi_session.state); + + switch (state) { + + case ASI_SESSION_STATE_INACTIVE: + return 0; + + case ASI_SESSION_STATE_ACTIVE: + original_cr3 = asi_restore_cr3(); + this_cpu_write(cpu_asi_session.state, + ASI_SESSION_STATE_ABORTED); + break; + + case ASI_SESSION_STATE_ABORTED: + /* + * In the normal case, if the session was already aborted + * then CR3 has already been restored. However if retry_abort + * is set then we restore CR3 again.
+ */ + if (this_cpu_read(cpu_asi_session.retry_abort)) + original_cr3 = asi_restore_cr3(); + else + original_cr3 = this_cpu_read( + cpu_asi_session.original_cr3); + break; + } + + this_cpu_inc(cpu_asi_session.abort_depth); + + return original_cr3; +} + +void asi_finish_abort(void) +{ + enum asi_session_state state; + + state = this_cpu_read(cpu_asi_session.state); + if (state == ASI_SESSION_STATE_INACTIVE) + return; + + WARN_ON(state != ASI_SESSION_STATE_ABORTED); + + /* if retry_abort is set then we retain the abort state */ + if (this_cpu_dec_return(cpu_asi_session.abort_depth) > 0 || + this_cpu_read(cpu_asi_session.retry_abort)) + return; + + asi_exit(this_cpu_read(cpu_asi_session.asi)); +} diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 874c427..bb363f3 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -14,6 +14,7 @@ #include #include +#include #include "../workqueue_internal.h" #include "../smpboot.h" @@ -2597,6 +2598,9 @@ static inline void finish_lock_switch(struct rq *rq) prepare_task_switch(struct rq *rq, struct task_struct *prev, struct task_struct *next) { +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + asi_abort(); +#endif kcov_prepare_switch(prev); sched_info_switch(rq, prev, next); perf_event_task_sched_out(prev, next); From patchwork Thu Jul 11 14:25:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040183 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6D4BA6C5 for ; Thu, 11 Jul 2019 14:30:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5D6D628AE1 for ; Thu, 11 Jul 2019 14:30:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 518DD28AE3; Thu, 11 Jul 2019 14:30:29 +0000 (UTC) 
From: Alexandre Chartre To: pbonzini@redhat.com,
rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 03/26] mm/asi: Handle page fault due to address space isolation Date: Thu, 11 Jul 2019 16:25:15 +0200 Message-Id: <1562855138-19507-4-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When address space isolation is active, kernel page faults can occur because data are not mapped in the ASI page-table. In such a case, log information about the fault and report the page fault as handled. As the page fault handler (like any exception handler) aborts isolation and switches back to the full kernel page-table, the faulty instruction will be retried using the full kernel address space.
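For illustration only, the logging policy described above (report a faulting address the first time it is seen, then stay quiet) can be modeled in user space roughly as follows. This is a minimal sketch under stated assumptions, not the kernel code: struct fault_log, fault_log_record() and FAULT_LOG_SIZE are hypothetical names, locking is left out, and the value 0 is assumed to mark a free slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAULT_LOG_SIZE 4	/* hypothetical; kept small for the example */

/* Hypothetical user-space stand-in for a per-ASI fault log. */
struct fault_log {
	unsigned long ip[FAULT_LOG_SIZE];	/* 0 marks a free slot */
};

/*
 * Record a faulting instruction pointer.
 *
 * Returns false if ip is already in the log (nothing to report), and
 * true if the caller should report the fault: either ip was stored in
 * the first free slot, or the log is full and ip could not be stored.
 */
bool fault_log_record(struct fault_log *log, unsigned long ip)
{
	size_t i;

	assert(log != NULL);

	for (i = 0; i < FAULT_LOG_SIZE; i++) {
		if (log->ip[i] == ip)
			return false;		/* already known */
		if (log->ip[i] == 0) {
			log->ip[i] = ip;	/* first free slot */
			return true;
		}
	}

	return true;	/* log is full: still report the fault */
}
```

With a zero-initialized log, the first call for a given address returns true and later calls for the same address return false. Once every slot is taken, an address that is not in the log is reported on every call, so a caller that wants to avoid log floods would need some extra rate limiting.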
Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 7 ++++ arch/x86/mm/asi.c | 68 ++++++++++++++++++++++++++++++++++++++++++++ arch/x86/mm/fault.c | 7 ++++ 3 files changed, 82 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index ff126e1..013d77a 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -9,9 +9,14 @@ #include #include +#define ASI_FAULT_LOG_SIZE 128 + struct asi { spinlock_t lock; /* protect all attributes */ pgd_t *pgd; /* ASI page-table */ + spinlock_t fault_lock; /* protect fault_log */ + unsigned long fault_log[ASI_FAULT_LOG_SIZE]; + bool fault_stack; /* display stack of fault? */ }; /* @@ -42,6 +47,8 @@ struct asi_session { extern void asi_destroy(struct asi *asi); extern int asi_enter(struct asi *asi); extern void asi_exit(struct asi *asi); +extern bool asi_fault(struct pt_regs *regs, unsigned long error_code, + unsigned long address); /* * Function to exit the current isolation. This is used to abort isolation diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index fabb923..717160d 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include @@ -18,6 +19,72 @@ /* ASI sessions, one per cpu */ DEFINE_PER_CPU_PAGE_ALIGNED(struct asi_session, cpu_asi_session); +static void asi_log_fault(struct asi *asi, struct pt_regs *regs, + unsigned long error_code, unsigned long address) +{ + int i = 0; + + /* + * Log information about the fault only if this is a fault + * we don't know about yet (and the fault log is not full). 
+ */ + spin_lock(&asi->fault_lock); + for (i = 0; i < ASI_FAULT_LOG_SIZE; i++) { + if (asi->fault_log[i] == regs->ip) { + spin_unlock(&asi->fault_lock); + return; + } + if (!asi->fault_log[i]) { + asi->fault_log[i] = regs->ip; + break; + } + } + spin_unlock(&asi->fault_lock); + + if (i >= ASI_FAULT_LOG_SIZE) + pr_warn("ASI %p: fault log buffer is full [%d]\n", asi, i); + + pr_info("ASI %p: PF#%d (%ld) at %pS on %px\n", asi, i, + error_code, (void *)regs->ip, (void *)address); + + if (asi->fault_stack) + show_stack(NULL, (unsigned long *)regs->sp); +} + +bool asi_fault(struct pt_regs *regs, unsigned long error_code, + unsigned long address) +{ + struct asi_session *asi_session; + + /* + * If address space isolation was active when the fault occurred + * then the page fault handler has already aborted the isolation + * (exception handlers abort isolation very early) and switched + * CR3 back to its original value. + */ + + /* + * If address space isolation is not active, or we have a fault + * after isolation was aborted then this is a regular kernel fault, + * and we don't handle it. + */ + asi_session = this_cpu_ptr(&cpu_asi_session); + if (asi_session->state == ASI_SESSION_STATE_INACTIVE) + return false; + + WARN_ON(asi_session->state != ASI_SESSION_STATE_ABORTED); + WARN_ON(asi_session->abort_depth != 1); + + /* + * We have a fault while the cpu is using address space isolation. + * Log the fault and report that we have handled the fault. This way, + * the faulty instruction will be retried with no isolation. 
+ * + */ + asi_log_fault(asi_session->asi, regs, error_code, address); + return true; +} + static int asi_init_mapping(struct asi *asi) { /* @@ -43,6 +110,7 @@ struct asi *asi_create(void) asi->pgd = page_address(page); spin_lock_init(&asi->lock); + spin_lock_init(&asi->fault_lock); err = asi_init_mapping(asi); if (err) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index 46df4c6..a405c43 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -29,6 +29,7 @@ #include /* efi_recover_from_page_fault()*/ #include /* store_idt(), ... */ #include /* exception stack */ +#include /* asi_fault() */ #define CREATE_TRACE_POINTS #include @@ -1252,6 +1253,12 @@ static int fault_in_kernel_space(unsigned long address) */ WARN_ON_ONCE(hw_error_code & X86_PF_PK); +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + /* Check if the fault occurs with address space isolation */ + if (asi_fault(regs, hw_error_code, address)) + return; +#endif + /* * We can fault-in kernel-space virtual memory on-demand. The * 'reference' page table is init_mm.pgd. 
From patchwork Thu Jul 11 14:25:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040179
From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 04/26] mm/asi: Functions to track buffers allocated for an ASI page-table Date: Thu, 11 Jul 2019 16:25:16 +0200 Message-Id: <1562855138-19507-5-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> Sender: 
kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add functions to track buffers allocated for an ASI page-table. An ASI page-table can have direct references to the kernel page table, at different levels (PGD, P4D, PUD, PMD). When freeing an ASI page-table, we should make sure that we free parts actually allocated for the ASI page-table, and not parts of the kernel page table referenced from the ASI page-table. To do so, we will keep track of buffers when building the ASI page-table. Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 26 +++++++++++ arch/x86/mm/Makefile | 2 +- arch/x86/mm/asi.c | 3 + arch/x86/mm/asi_pagetable.c | 99 +++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 129 insertions(+), 1 deletions(-) create mode 100644 arch/x86/mm/asi_pagetable.c diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 013d77a..3d965e6 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -8,12 +8,35 @@ #include #include +#include + +enum page_table_level { + PGT_LEVEL_PTE, + PGT_LEVEL_PMD, + PGT_LEVEL_PUD, + PGT_LEVEL_P4D, + PGT_LEVEL_PGD +}; #define ASI_FAULT_LOG_SIZE 128 struct asi { spinlock_t lock; /* protect all attributes */ pgd_t *pgd; /* ASI page-table */ + + /* + * An ASI page-table can have direct references to the full kernel + * page-table, at different levels (PGD, P4D, PUD, PMD). When freeing + * an ASI page-table, we should make sure that we free parts actually + * allocated for the ASI page-table, and not part of the full kernel + * page-table referenced from the ASI page-table. + * + * To do so, the backend_pages XArray is used to keep track of pages + * used for the kernel isolation page-table. 
+ */ + struct xarray backend_pages; /* page-table pages */ + unsigned long backend_pages_count; /* pages count */ + spinlock_t fault_lock; /* protect fault_log */ unsigned long fault_log[ASI_FAULT_LOG_SIZE]; bool fault_stack; /* display stack of fault? */ @@ -43,6 +66,9 @@ struct asi_session { DECLARE_PER_CPU_PAGE_ALIGNED(struct asi_session, cpu_asi_session); +void asi_init_backend(struct asi *asi); +void asi_fini_backend(struct asi *asi); + extern struct asi *asi_create(void); extern void asi_destroy(struct asi *asi); extern int asi_enter(struct asi *asi); diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index dae5c8a..b972f0f 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -49,7 +49,7 @@ obj-$(CONFIG_X86_INTEL_MPX) += mpx.o obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o obj-$(CONFIG_PAGE_TABLE_ISOLATION) += pti.o -obj-$(CONFIG_ADDRESS_SPACE_ISOLATION) += asi.o +obj-$(CONFIG_ADDRESS_SPACE_ISOLATION) += asi.o asi_pagetable.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index 717160d..dfde245 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -111,6 +111,7 @@ struct asi *asi_create(void) asi->pgd = page_address(page); spin_lock_init(&asi->lock); spin_lock_init(&asi->fault_lock); + asi_init_backend(asi); err = asi_init_mapping(asi); if (err) @@ -132,6 +133,8 @@ void asi_destroy(struct asi *asi) if (asi->pgd) free_page((unsigned long)asi->pgd); + asi_fini_backend(asi); + kfree(asi); } EXPORT_SYMBOL(asi_destroy); diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c new file mode 100644 index 0000000..7a8f791 --- /dev/null +++ b/arch/x86/mm/asi_pagetable.c @@ -0,0 +1,99 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2019, Oracle and/or its affiliates. All rights reserved. 
+ * + */ + +#include + +/* + * Get the pointer to the beginning of a page table directory from a page + * table directory entry. + */ +#define ASI_BACKEND_PAGE_ALIGN(entry) \ + ((typeof(entry))(((unsigned long)(entry)) & PAGE_MASK)) + +/* + * Pages used to build the address space isolation page-table are stored + * in the backend_pages XArray. Each entry in the array is a logical OR + * of the page address and the page table level (PTE, PMD, PUD, P4D) this + * page is used for in the address space isolation page-table. + * + * As a page address is aligned with PAGE_SIZE, we have plenty of space + * for storing the page table level (which is a value between 0 and 4) in + * the low bits of the page address. + * + */ + +#define ASI_BACKEND_PAGE_ENTRY(addr, level) \ + ((typeof(addr))(((unsigned long)(addr)) | ((unsigned long)(level)))) +#define ASI_BACKEND_PAGE_ADDR(entry) \ + ((void *)(((unsigned long)(entry)) & PAGE_MASK)) +#define ASI_BACKEND_PAGE_LEVEL(entry) \ + ((enum page_table_level)(((unsigned long)(entry)) & ~PAGE_MASK)) + +static int asi_add_backend_page(struct asi *asi, void *addr, + enum page_table_level level) +{ + unsigned long index; + void *old_entry; + + if ((!addr) || ((unsigned long)addr) & ~PAGE_MASK) + return -EINVAL; + + lockdep_assert_held(&asi->lock); + index = asi->backend_pages_count; + + old_entry = xa_store(&asi->backend_pages, index, + ASI_BACKEND_PAGE_ENTRY(addr, level), + GFP_KERNEL); + if (xa_is_err(old_entry)) + return xa_err(old_entry); + if (old_entry) + return -EBUSY; + + asi->backend_pages_count++; + + return 0; +} + +void asi_init_backend(struct asi *asi) +{ + xa_init(&asi->backend_pages); +} + +void asi_fini_backend(struct asi *asi) +{ + unsigned long index; + void *entry; + + if (asi->backend_pages_count) { + xa_for_each(&asi->backend_pages, index, entry) + free_page((unsigned long)ASI_BACKEND_PAGE_ADDR(entry)); + } +} + +/* + * Check if an offset in the address space isolation page-table is valid, + * i.e. 
check that the offset is on a page effectively belonging to the + * address space isolation page-table. + */ +static bool asi_valid_offset(struct asi *asi, void *offset) +{ + unsigned long index; + void *addr, *entry; + bool valid; + + addr = ASI_BACKEND_PAGE_ALIGN(offset); + valid = false; + + lockdep_assert_held(&asi->lock); + xa_for_each(&asi->backend_pages, index, entry) { + if (ASI_BACKEND_PAGE_ADDR(entry) == addr) { + valid = true; + break; + } + } + + return valid; +} From patchwork Thu Jul 11 14:25:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040101 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E2F4F112C for ; Thu, 11 Jul 2019 14:27:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D083E28ADC for ; Thu, 11 Jul 2019 14:27:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C3A6828AE0; Thu, 11 Jul 2019 14:27:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 723C328AD1 for ; Thu, 11 Jul 2019 14:27:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728627AbfGKO1I (ORCPT ); Thu, 11 Jul 2019 10:27:08 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:36390 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728585AbfGKO1I (ORCPT ); Thu, 11 Jul 2019 10:27:08 -0400 Received: 
from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOAv6001518; Thu, 11 Jul 2019 14:26:08 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=gEB6qC/xqft/fsE7GBzDnr3O7NBTUOw27/wZk4edBIk=; b=2lbqLqqQMw9uxcd2J0e8Od2Fbg4RGmN4xuVesYNv7Y1WX2picK7zuP05DlYEdlTfaf8/ /r7nj8HLahdOMoTA2qZgo+/gfuMNT7Tq+4PK/S1hx7st0ER2GrN8SVMswbugTm+RisIK EY8L1gj7I42ax3Fxdef1h3ib61twTnbM35uhbRL7AAUoZFNDkEIoKv7I4NP1KhgZcU/R jZEVCuE/tyUKHROCoRyc1hZZuUr29PCD1HGeFsTbcm2+VPtYBpOACfriqJHm7UaHZUcG iSP2iP8CAbWrB5YbqwtuxyxbjJXj06/ciMakhxZ1tr5dVf0YbXkVnIwILeJB0axX/ioc dQ== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0dwu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:08 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPctw021444; Thu, 11 Jul 2019 14:25:59 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 05/26] mm/asi: Add ASI page-table entry offset functions Date: Thu, 11 Jul 2019 16:25:17 +0200 Message-Id: <1562855138-19507-6-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> 
X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=1 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add wrappers around the p4d/pud/pmd/pte offset kernel functions which ensure that page-table pointers are in the specified ASI page-table. Signed-off-by: Alexandre Chartre --- arch/x86/mm/asi_pagetable.c | 62 +++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 62 insertions(+), 0 deletions(-) diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index 7a8f791..a89e02e 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -97,3 +97,65 @@ static bool asi_valid_offset(struct asi *asi, void *offset) return valid; } + +/* + * asi_pXX_offset() functions are equivalent to kernel pXX_offset() + * functions but, in addition, they ensure that page table pointers + * are in the kernel isolation page table. Otherwise an error is + * returned. 
+ */ + +static pte_t *asi_pte_offset(struct asi *asi, pmd_t *pmd, unsigned long addr) +{ + pte_t *pte; + + pte = pte_offset_map(pmd, addr); + if (!asi_valid_offset(asi, pte)) { + pr_err("ASI %p: PTE %px not found\n", asi, pte); + return ERR_PTR(-EINVAL); + } + + return pte; +} + +static pmd_t *asi_pmd_offset(struct asi *asi, pud_t *pud, unsigned long addr) +{ + pmd_t *pmd; + + pmd = pmd_offset(pud, addr); + if (!asi_valid_offset(asi, pmd)) { + pr_err("ASI %p: PMD %px not found\n", asi, pmd); + return ERR_PTR(-EINVAL); + } + + return pmd; +} + +static pud_t *asi_pud_offset(struct asi *asi, p4d_t *p4d, unsigned long addr) +{ + pud_t *pud; + + pud = pud_offset(p4d, addr); + if (!asi_valid_offset(asi, pud)) { + pr_err("ASI %p: PUD %px not found\n", asi, pud); + return ERR_PTR(-EINVAL); + } + + return pud; +} + +static p4d_t *asi_p4d_offset(struct asi *asi, pgd_t *pgd, unsigned long addr) +{ + p4d_t *p4d; + + p4d = p4d_offset(pgd, addr); + /* + * p4d is the same as pgd if we don't have a 5-level page table. 
+ */ + if ((p4d != (p4d_t *)pgd) && !asi_valid_offset(asi, p4d)) { + pr_err("ASI %p: P4D %px not found\n", asi, p4d); + return ERR_PTR(-EINVAL); + } + + return p4d; +} From patchwork Thu Jul 11 14:25:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040125 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 095D6112C for ; Thu, 11 Jul 2019 14:27:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ECA9928ABE for ; Thu, 11 Jul 2019 14:27:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DD6F228AD9; Thu, 11 Jul 2019 14:27:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 81F3C28ABE for ; Thu, 11 Jul 2019 14:27:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728848AbfGKO14 (ORCPT ); Thu, 11 Jul 2019 10:27:56 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:39610 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728807AbfGKO1z (ORCPT ); Thu, 11 Jul 2019 10:27:55 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO7Xi100417; Thu, 11 Jul 2019 14:26:11 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; 
s=corp-2018-07-02; bh=lCtyGi9lNXZNoOqW5lwUfawQTHm8Gs8rgqwuOoYVSXg=; b=lJAKdulhKeHbnYjRsyD5nMQD+gRQsuPOeXQdtSSCbufwUKXgsEfUN2Rz5GYHTyi0V/o7 /7wfZKsgL4wiSMWTnhXm2xWq6UxcJMo9Wcw35HV4xUxxM34clQuAg7xvJmhbR4yOZmCW HtxhTH15oHUeFU7K6yML/HzF46ooAQ0oDiETifZF8/K0Eb+1V8sYTyAXh2PY56Cglqe0 4to8Zt1JMHQxjc6IQGJB++dychz0/PvHCxVVbx1+c6+qGz1P2iNzE3+TWmWcb+jtnmaE 8hDeydy1NGeaojoYQskyTsqpWzZDVRocAyhp4SKVywbNL71DbkjE7rT5ZK2hAsuvr/wq nA== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp2120.oracle.com with ESMTP id 2tjkkq0c74-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:11 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPctx021444; Thu, 11 Jul 2019 14:26:02 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 06/26] mm/asi: Add ASI page-table entry allocation functions Date: Thu, 11 Jul 2019 16:25:18 +0200 Message-Id: <1562855138-19507-7-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 
adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add functions to allocate p4d/pud/pmd/pte pages for an ASI page-table and keep track of them. Signed-off-by: Alexandre Chartre --- arch/x86/mm/asi_pagetable.c | 111 +++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 111 insertions(+), 0 deletions(-) diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index a89e02e..0fc6d59 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -4,6 +4,8 @@ * */ +#include + #include /* @@ -159,3 +161,112 @@ static bool asi_valid_offset(struct asi *asi, void *offset) return p4d; } + +/* + * asi_pXX_alloc() functions are equivalent to kernel pXX_alloc() functions + * but, in addition, they keep track of new pages allocated for the specified + * ASI. + */ + +static pte_t *asi_pte_alloc(struct asi *asi, pmd_t *pmd, unsigned long addr) +{ + struct page *page; + pte_t *pte; + int err; + + if (pmd_none(*pmd)) { + page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!page) + return ERR_PTR(-ENOMEM); + pte = (pte_t *)page_address(page); + err = asi_add_backend_page(asi, pte, PGT_LEVEL_PTE); + if (err) { + free_page((unsigned long)pte); + return ERR_PTR(err); + } + set_pmd_safe(pmd, __pmd(__pa(pte) | _KERNPG_TABLE)); + pte = pte_offset_map(pmd, addr); + } else { + pte = asi_pte_offset(asi, pmd, addr); + } + + return pte; +} + +static pmd_t *asi_pmd_alloc(struct asi *asi, pud_t *pud, unsigned long addr) +{ + struct page *page; + pmd_t *pmd; + int err; + + if (pud_none(*pud)) { + page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!page) + return ERR_PTR(-ENOMEM); + pmd = (pmd_t *)page_address(page); + err = asi_add_backend_page(asi, pmd, PGT_LEVEL_PMD); + if (err) { + free_page((unsigned long)pmd); + return ERR_PTR(err); + } + set_pud_safe(pud, 
__pud(__pa(pmd) | _KERNPG_TABLE)); + pmd = pmd_offset(pud, addr); + } else { + pmd = asi_pmd_offset(asi, pud, addr); + } + + return pmd; +} + +static pud_t *asi_pud_alloc(struct asi *asi, p4d_t *p4d, unsigned long addr) +{ + struct page *page; + pud_t *pud; + int err; + + if (p4d_none(*p4d)) { + page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!page) + return ERR_PTR(-ENOMEM); + pud = (pud_t *)page_address(page); + err = asi_add_backend_page(asi, pud, PGT_LEVEL_PUD); + if (err) { + free_page((unsigned long)pud); + return ERR_PTR(err); + } + set_p4d_safe(p4d, __p4d(__pa(pud) | _KERNPG_TABLE)); + pud = pud_offset(p4d, addr); + } else { + pud = asi_pud_offset(asi, p4d, addr); + } + + return pud; +} + +static p4d_t *asi_p4d_alloc(struct asi *asi, pgd_t *pgd, unsigned long addr) +{ + struct page *page; + p4d_t *p4d; + int err; + + if (!pgtable_l5_enabled()) + return (p4d_t *)pgd; + + if (pgd_none(*pgd)) { + page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!page) + return ERR_PTR(-ENOMEM); + p4d = (p4d_t *)page_address(page); + err = asi_add_backend_page(asi, p4d, PGT_LEVEL_P4D); + if (err) { + free_page((unsigned long)p4d); + return ERR_PTR(err); + } + set_pgd_safe(pgd, __pgd(__pa(p4d) | _KERNPG_TABLE)); + p4d = p4d_offset(pgd, addr); + } else { + p4d = asi_p4d_offset(asi, pgd, addr); + } + + return p4d; +} From patchwork Thu Jul 11 14:25:19 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040177 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9751D1395 for ; Thu, 11 Jul 2019 14:30:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 850F628ABE for ; Thu, 11 Jul 2019 14:30:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 
7842628AD1; Thu, 11 Jul 2019 14:30:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1205928ABE for ; Thu, 11 Jul 2019 14:30:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728718AbfGKO1e (ORCPT ); Thu, 11 Jul 2019 10:27:34 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:41778 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728440AbfGKO1c (ORCPT ); Thu, 11 Jul 2019 10:27:32 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO7tr013226; Thu, 11 Jul 2019 14:26:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=385wpu4NDusZzlrvO4wdMoGEKubdnUYw4N6zqr4muDU=; b=j/8IHCr9S0gUaWnFi3Upjzuz55gy9Ljh6nfaWePnOoytFJ9B7SnJKULz74C3ywY6RXqi e2nlqVa9W7MNYSdP2Vf2C/oYboYtNcwS8EW9VbU8cIsaN6Fl9HN5Irx/xBWUPbwuO3Rb pA8nC/jt3qZSlX5P9mWE6i1BavyZZ35GVGnmvivmjOvqS8a8/Z0v3tfm2TkPowKHagRD jG7N8dpzCgLK0VRzFa0NpGxGayMyo+vj0XV5/GWaoX6BlSffhi7TdWAzW0NBMvhvqMgW Mi7BkTLlSEZlEqWrGTNR/OK6j6uUWeCyQ/O4mFUX/8vA2YAylUPyhsIeKxT9HDn0UNu4 +A== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2120.oracle.com with ESMTP id 2tjm9r0bn2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:09 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu0021444; Thu, 11 Jul 2019 14:26:06 GMT 
From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 07/26] mm/asi: Add ASI page-table entry set functions Date: Thu, 11 Jul 2019 16:25:19 +0200 Message-Id: <1562855138-19507-8-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add wrappers around the page table entry (pgd/p4d/pud/pmd) set functions which check that an existing entry is not being overwritten. 
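As a rough illustration of the "do not overwrite an existing entry" rule described above, here is a minimal user-space sketch. It is not the kernel implementation: entry_t and entry_set_once() are hypothetical names, and the value 0 is assumed to play the role of an empty (pXX_none) entry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a page-table entry; 0 means "empty". */
typedef uint64_t entry_t;

/*
 * Install value in *slot unless the slot already holds a different,
 * non-empty value.
 *
 * Returns 0 if the slot was empty and is now set, or if it already held
 * the same value (idempotent case). Returns -EBUSY if setting the value
 * would overwrite a different existing entry; the slot is left untouched.
 */
int entry_set_once(entry_t *slot, entry_t value)
{
	assert(slot != NULL);

	if (*slot == value)
		return 0;	/* nothing to do */

	if (*slot != 0)
		return -EBUSY;	/* refuse to overwrite */

	*slot = value;
	return 0;
}
```

Treating a repeated store of the same value as success lets callers map the same range twice without error, while a conflicting store is rejected and leaves the existing mapping intact.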
Signed-off-by: Alexandre Chartre --- arch/x86/mm/asi_pagetable.c | 124 +++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 124 insertions(+), 0 deletions(-) diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index 0fc6d59..e17af9e 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -270,3 +270,127 @@ static bool asi_valid_offset(struct asi *asi, void *offset) return p4d; } + +/* + * asi_set_pXX() functions are equivalent to kernel set_pXX() functions + * but, in addition, they ensure that they are not overwriting an already + * existing reference in the page table. Otherwise an error is returned. + */ +static int asi_set_pte(struct asi *asi, pte_t *pte, pte_t pte_value) +{ +#ifdef DEBUG + /* + * The pte pointer should come from asi_pte_alloc() or asi_pte_offset() + * both of which check if the pointer is in the kernel isolation page + * table. So this is a paranoid check to ensure the pointer is really + * in the kernel page table. + */ + if (!asi_valid_offset(asi, pte)) { + pr_err("ASI %p: PTE %px not found\n", asi, pte); + return -EINVAL; + } +#endif + set_pte(pte, pte_value); + + return 0; +} + +static int asi_set_pmd(struct asi *asi, pmd_t *pmd, pmd_t pmd_value) +{ +#ifdef DEBUG + /* + * The pmd pointer should come from asi_pmd_alloc() or asi_pmd_offset() + * both of which check if the pointer is in the kernel isolation page + * table. So this is a paranoid check to ensure the pointer is really + * in the kernel page table. 
+ */ + if (!asi_valid_offset(asi, pmd)) { + pr_err("ASI %p: PMD %px not found\n", asi, pmd); + return -EINVAL; + } +#endif + if (pmd_val(*pmd) == pmd_val(pmd_value)) + return 0; + + if (!pmd_none(*pmd)) { + pr_err("ASI %p: PMD %px overwriting %lx with %lx\n", + asi, pmd, pmd_val(*pmd), pmd_val(pmd_value)); + return -EBUSY; + } + + set_pmd(pmd, pmd_value); + + return 0; +} + +static int asi_set_pud(struct asi *asi, pud_t *pud, pud_t pud_value) +{ +#ifdef DEBUG + /* + * The pud pointer should come from asi_pud_alloc() or asi_pud_offset() + * both of which check if the pointer is in the kernel isolation page + * table. So this is a paranoid check to ensure the pointer is really + * in the kernel page table. + */ + if (!asi_valid_offset(asi, pud)) { + pr_err("ASI %p: PUD %px not found\n", asi, pud); + return -EINVAL; + } +#endif + if (pud_val(*pud) == pud_val(pud_value)) + return 0; + + if (!pud_none(*pud)) { + pr_err("ASI %p: PUD %px overwriting %lx with %lx\n", + asi, pud, pud_val(*pud), pud_val(pud_value)); + return -EBUSY; + } + + set_pud(pud, pud_value); + + return 0; +} + +static int asi_set_p4d(struct asi *asi, p4d_t *p4d, p4d_t p4d_value) +{ +#ifdef DEBUG + /* + * The p4d pointer should come from asi_p4d_alloc() or asi_p4d_offset() + * both of which check if the pointer is in the kernel isolation page + * table. So this is a paranoid check to ensure the pointer is really + * in the kernel page table. 
+ */ + if (!asi_valid_offset(asi, p4d)) { + pr_err("ASI %p: P4D %px not found\n", asi, p4d); + return -EINVAL; + } +#endif + if (p4d_val(*p4d) == p4d_val(p4d_value)) + return 0; + + if (!p4d_none(*p4d)) { + pr_err("ASI %p: P4D %px overwriting %lx with %lx\n", + asi, p4d, p4d_val(*p4d), p4d_val(p4d_value)); + return -EBUSY; + } + + set_p4d(p4d, p4d_value); + + return 0; +} + +static int asi_set_pgd(struct asi *asi, pgd_t *pgd, pgd_t pgd_value) +{ + if (pgd_val(*pgd) == pgd_val(pgd_value)) + return 0; + + if (!pgd_none(*pgd)) { + pr_err("ASI %p: PGD %px overwriting %lx with %lx\n", + asi, pgd, pgd_val(*pgd), pgd_val(pgd_value)); + return -EBUSY; + } + + set_pgd(pgd, pgd_value); + + return 0; +} From patchwork Thu Jul 11 14:25:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040149 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5515C112C for ; Thu, 11 Jul 2019 14:29:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4580228ABE for ; Thu, 11 Jul 2019 14:29:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3908828AD1; Thu, 11 Jul 2019 14:29:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A1D5E28ABE for ; Thu, 11 Jul 2019 14:29:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729000AbfGKO21 (ORCPT ); Thu, 11 Jul 2019 
10:28:27 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:37906 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728969AbfGKO2Z (ORCPT ); Thu, 11 Jul 2019 10:28:25 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOMPI001606; Thu, 11 Jul 2019 14:26:13 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=y9IAz60J7qXRlFN2NRsHUzd+qvZ9ZunbWoLTMdcwAx8=; b=SmfOq+K7kq1i3rKVZwlHKx3Hxs01appLJg4uPnU/ZriVoOVHCERIDHALooM0c4fIGSBJ 6sJUwYFar/i1eRJ0bDwA5eXmBBC3pkJst2pWeRiO5XvbO/TYtBqcsljdxwFDY+ygijHk H0MYZu44ra2JBIbub0jXMvwYLNtqfrcVe7JBvlbVaiEu0d5hM3uanYLNWIvfENEj46Wm gz3Qt8PeZrmim/eo4JN+pfX8fKIq4/XNikWq8u1EuUds0UNZgKC6NbZKX2Q0LuHEumqL b5/SpV0/Ww9/ELn0rWmcvfcCc+XgXWqEmOxEsU2TT2yDZ4Zr37YzZrqonMAsPGVng5cD rQ== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0dx7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:12 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu1021444; Thu, 11 Jul 2019 14:26:09 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 08/26] mm/asi: Functions to populate an ASI page-table from a VA range Date: Thu, 11 Jul 2019 16:25:20 +0200 Message-Id: 
<1562855138-19507-9-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=917 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Provide functions to copy page-table entries from the kernel page-table to an ASI page-table for a specified VA range. These functions are based on the copy_pxx_range() functions defined in mm/memory.c. One difference is that a level parameter can be specified to indicate the page-table level (PGD, P4D, PUD, PMD, PTE) at which the copy should be done. In addition, the functions don't rely on mm or vma, and they don't alter the source page-table even if an entry is bad. Finally, the VA range start and size don't need to be page-aligned.
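For illustration only (not part of the patch): a minimal user-space sketch of how a walk over an unaligned VA range can be split at table boundaries, in the style of the kernel's pmd_addr_end(). The names `BLOCK_SHIFT`, `block_addr_end()` and `count_blocks()` are hypothetical, and the 2 MiB block size is an assumption chosen to resemble an x86-64 PMD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SHIFT	21			/* assumed 2 MiB block */
#define BLOCK_SIZE	(1ULL << BLOCK_SHIFT)
#define BLOCK_MASK	(~(BLOCK_SIZE - 1))

/* Next block boundary after addr, capped at end. */
static uint64_t block_addr_end(uint64_t addr, uint64_t end)
{
	uint64_t boundary = (addr + BLOCK_SIZE) & BLOCK_MASK;

	/* The "- 1" keeps the comparison correct if boundary wraps to 0. */
	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Count how many block-level entries a walk over [addr, end) visits. */
static size_t count_blocks(uint64_t addr, uint64_t end)
{
	uint64_t next;
	size_t n = 0;

	assert(addr < end);	/* the do/while form needs a non-empty range */

	do {
		next = block_addr_end(addr, end);
		n++;		/* one entry covers [addr, next) */
	} while (addr = next, addr < end);

	return n;
}
```

Neither end of the range has to be aligned: the first step stops at the first block boundary and the last step stops at `end`, so a small range that straddles a boundary visits two entries.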
Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 4 + arch/x86/mm/asi_pagetable.c | 205 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 209 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 3d965e6..19656aa 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -76,6 +76,10 @@ struct asi_session { extern bool asi_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address); +extern int asi_map_range(struct asi *asi, void *ptr, size_t size, + enum page_table_level level); +extern int asi_map(struct asi *asi, void *ptr, unsigned long size); + /* * Function to exit the current isolation. This is used to abort isolation * when a task using isolation is scheduled out. diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index e17af9e..0169395 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -394,3 +394,208 @@ static int asi_set_pgd(struct asi *asi, pgd_t *pgd, pgd_t pgd_value) return 0; } + +static int asi_copy_pte_range(struct asi *asi, pmd_t *dst_pmd, pmd_t *src_pmd, + unsigned long addr, unsigned long end) +{ + pte_t *src_pte, *dst_pte; + + dst_pte = asi_pte_alloc(asi, dst_pmd, addr); + if (IS_ERR(dst_pte)) + return PTR_ERR(dst_pte); + + addr &= PAGE_MASK; + src_pte = pte_offset_map(src_pmd, addr); + + do { + asi_set_pte(asi, dst_pte, *src_pte); + + } while (dst_pte++, src_pte++, addr += PAGE_SIZE, addr < end); + + return 0; +} + +static int asi_copy_pmd_range(struct asi *asi, pud_t *dst_pud, pud_t *src_pud, + unsigned long addr, unsigned long end, + enum page_table_level level) +{ + pmd_t *src_pmd, *dst_pmd; + unsigned long next; + int err; + + dst_pmd = asi_pmd_alloc(asi, dst_pud, addr); + if (IS_ERR(dst_pmd)) + return PTR_ERR(dst_pmd); + + src_pmd = pmd_offset(src_pud, addr); + + do { + next = pmd_addr_end(addr, end); + if (level == PGT_LEVEL_PMD || pmd_none(*src_pmd) || + pmd_trans_huge(*src_pmd) 
|| pmd_devmap(*src_pmd)) { + err = asi_set_pmd(asi, dst_pmd, *src_pmd); + if (err) + return err; + continue; + } + + if (!pmd_present(*src_pmd)) { + pr_warn("ASI %p: PMD not present for [%lx,%lx]\n", + asi, addr, next - 1); + pmd_clear(dst_pmd); + continue; + } + + err = asi_copy_pte_range(asi, dst_pmd, src_pmd, addr, next); + if (err) { + pr_err("ASI %p: PMD error copying PTE addr=%lx next=%lx\n", + asi, addr, next); + return err; + } + + } while (dst_pmd++, src_pmd++, addr = next, addr < end); + + return 0; +} + +static int asi_copy_pud_range(struct asi *asi, p4d_t *dst_p4d, p4d_t *src_p4d, + unsigned long addr, unsigned long end, + enum page_table_level level) +{ + pud_t *src_pud, *dst_pud; + unsigned long next; + int err; + + dst_pud = asi_pud_alloc(asi, dst_p4d, addr); + if (IS_ERR(dst_pud)) + return PTR_ERR(dst_pud); + + src_pud = pud_offset(src_p4d, addr); + + do { + next = pud_addr_end(addr, end); + if (level == PGT_LEVEL_PUD || pud_none(*src_pud) || + pud_trans_huge(*src_pud) || pud_devmap(*src_pud)) { + err = asi_set_pud(asi, dst_pud, *src_pud); + if (err) + return err; + continue; + } + + err = asi_copy_pmd_range(asi, dst_pud, src_pud, addr, next, + level); + if (err) { + pr_err("ASI %p: PUD error copying PMD addr=%lx next=%lx\n", + asi, addr, next); + return err; + } + + } while (dst_pud++, src_pud++, addr = next, addr < end); + + return 0; +} + +static int asi_copy_p4d_range(struct asi *asi, pgd_t *dst_pgd, pgd_t *src_pgd, + unsigned long addr, unsigned long end, + enum page_table_level level) +{ + p4d_t *src_p4d, *dst_p4d; + unsigned long next; + int err; + + dst_p4d = asi_p4d_alloc(asi, dst_pgd, addr); + if (IS_ERR(dst_p4d)) + return PTR_ERR(dst_p4d); + + src_p4d = p4d_offset(src_pgd, addr); + + do { + next = p4d_addr_end(addr, end); + if (level == PGT_LEVEL_P4D || p4d_none(*src_p4d)) { + err = asi_set_p4d(asi, dst_p4d, *src_p4d); + if (err) + return err; + continue; + } + + err = asi_copy_pud_range(asi, dst_p4d, src_p4d, addr, next, + level); + if 
(err) { + pr_err("ASI %p: P4D error copying PUD addr=%lx next=%lx\n", + asi, addr, next); + return err; + } + + } while (dst_p4d++, src_p4d++, addr = next, addr < end); + + return 0; +} + +static int asi_copy_pgd_range(struct asi *asi, + pgd_t *dst_pagetable, pgd_t *src_pagetable, + unsigned long addr, unsigned long end, + enum page_table_level level) +{ + pgd_t *src_pgd, *dst_pgd; + unsigned long next; + int err; + + dst_pgd = pgd_offset_pgd(dst_pagetable, addr); + src_pgd = pgd_offset_pgd(src_pagetable, addr); + + do { + next = pgd_addr_end(addr, end); + if (level == PGT_LEVEL_PGD || pgd_none(*src_pgd)) { + err = asi_set_pgd(asi, dst_pgd, *src_pgd); + if (err) + return err; + continue; + } + + err = asi_copy_p4d_range(asi, dst_pgd, src_pgd, addr, next, + level); + if (err) { + pr_err("ASI %p: PGD error copying P4D addr=%lx next=%lx\n", + asi, addr, next); + return err; + } + + } while (dst_pgd++, src_pgd++, addr = next, addr < end); + + return 0; +} + +/* + * Copy page table entries from the current page table (i.e. from the + * kernel page table) to the specified ASI page-table. The level + * parameter specifies the page-table level (PGD, P4D, PUD, PMD, PTE) + * at which the copy should be done. + */ +int asi_map_range(struct asi *asi, void *ptr, size_t size, + enum page_table_level level) +{ + unsigned long addr = (unsigned long)ptr; + unsigned long end = addr + ((unsigned long)size); + unsigned long flags; + int err; + + pr_debug("ASI %p: MAP %px/%lx/%d\n", asi, ptr, size, level); + + spin_lock_irqsave(&asi->lock, flags); + err = asi_copy_pgd_range(asi, asi->pgd, current->mm->pgd, + addr, end, level); + spin_unlock_irqrestore(&asi->lock, flags); + + return err; +} +EXPORT_SYMBOL(asi_map_range); + +/* + * Copy page-table PTE entries from the current page-table to the + * specified ASI page-table.
+ */ +int asi_map(struct asi *asi, void *ptr, unsigned long size) +{ + return asi_map_range(asi, ptr, size, PGT_LEVEL_PTE); +} +EXPORT_SYMBOL(asi_map); From patchwork Thu Jul 11 14:25:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040175 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7366414E5 for ; Thu, 11 Jul 2019 14:29:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5EBFD28ABE for ; Thu, 11 Jul 2019 14:29:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 52E6B28AC8; Thu, 11 Jul 2019 14:29:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0ABB028AD9 for ; Thu, 11 Jul 2019 14:29:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728744AbfGKO1h (ORCPT ); Thu, 11 Jul 2019 10:27:37 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:41874 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728738AbfGKO1g (ORCPT ); Thu, 11 Jul 2019 10:27:36 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOE8u013326; Thu, 11 Jul 2019 14:26:21 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; 
bh=x/IjzSr0NJXbBWkc/GbWFGd8aRvvo+k/bAxRCg2smaA=; b=UpGsaKxVuo2eIaT3MIyMj6uwvZYdjod6xr9sIWH6GLVIiGJo5xv5khjwyLVyDDuy1GvG yfeeGSjBWTuaMV/ezPA0NULlZQYuPGQftx9frpdHWZCWGt6EzWWMYPjn5VPHT47SjUpI A1Zvxp/4R+waiK8CFltSblp1Oe52CtoJA0LD+OBZSfdqHFgWhRLUnvbf0i9N2AFU9yvt 42HuhPrb/rYgNn3SpXtQLfgcHyAjGcS15pNxoiZQoNGdL0uAmA7B00OD3j6dk0aaUwQn KjZFUxYUuNzdESj/m6+MVakQNxN1UqdtuIgCsY0XLrDA5xHgNra+3zY3ilOMR4T3aYcQ vg== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2120.oracle.com with ESMTP id 2tjm9r0bpk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:21 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu2021444; Thu, 11 Jul 2019 14:26:12 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 09/26] mm/asi: Helper functions to map module into ASI Date: Thu, 11 Jul 2019 16:25:21 +0200 Message-Id: <1562855138-19507-10-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=896 adultscore=0 
classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add helper functions to easily map a module into an ASI. Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 21 +++++++++++++++++++++ 1 files changed, 21 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 19656aa..b5dbc49 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -6,6 +6,7 @@ #ifndef __ASSEMBLY__ +#include #include #include #include @@ -81,6 +82,26 @@ extern int asi_map_range(struct asi *asi, void *ptr, size_t size, extern int asi_map(struct asi *asi, void *ptr, unsigned long size); /* + * Copy the memory mapping for the current module. This is defined as a + * macro to ensure it is expanded in the module making the call so that + * THIS_MODULE has the correct value. + */ +#define ASI_MAP_THIS_MODULE(asi) \ + (asi_map(asi, THIS_MODULE->core_layout.base, \ + THIS_MODULE->core_layout.size)) + +static inline int asi_map_module(struct asi *asi, char *module_name) +{ + struct module *module; + + module = find_module(module_name); + if (!module) + return -ESRCH; + + return asi_map(asi, module->core_layout.base, module->core_layout.size); +} + +/* * Function to exit the current isolation. This is used to abort isolation * when a task using isolation is scheduled out. 
*/ From patchwork Thu Jul 11 14:25:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040139 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DC9671395 for ; Thu, 11 Jul 2019 14:28:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CCDAB28ABE for ; Thu, 11 Jul 2019 14:28:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BE28028AD1; Thu, 11 Jul 2019 14:28:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4C8F528ABE for ; Thu, 11 Jul 2019 14:28:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729093AbfGKO2v (ORCPT ); Thu, 11 Jul 2019 10:28:51 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:40570 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729087AbfGKO2s (ORCPT ); Thu, 11 Jul 2019 10:28:48 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO7gN100410; Thu, 11 Jul 2019 14:26:23 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=d9A9L3caok6Qomoe3AKAL7r9AlPWZpUNUtRWqLcduag=; b=MARahqNw+qo/sURXfHbD3S/83gqCOgFJ/swtRFnPmTZXmwoCOVR/N/i1pIKlToYWY/Ri 
k2aBJrN9/v+khzouzf0G9zHxuoQwHwjL5cB4NqEmK6LZPn4Du9L8R3LvRixNs0Ek2k51 TpfCDdoT/V5OQHZtXE2PjUdEaibVkBeu3OSq6a2iDA4XeaUx9Z+Ps6cN0b9Yifm/yA4j Iy9lSi7ma1G+usDIN0MJA79lNYFYSEqhzzbTW8owPRRK6v2c+xTImcLNpv1ndNvar4Yf gffkGfUVZ70Om45QblhdamFaGWrmGQZSauAg6M3MpWqBnQzN6fxwcMN6Aymg/BBEx4rs vQ== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp2120.oracle.com with ESMTP id 2tjkkq0c8w-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:23 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu3021444; Thu, 11 Jul 2019 14:26:15 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 10/26] mm/asi: Keep track of VA ranges mapped in ASI page-table Date: Thu, 11 Jul 2019 16:25:22 +0200 Message-Id: <1562855138-19507-11-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=882 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: 
kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add functions to keep track of VA ranges mapped in an ASI page-table. This will be used when unmapping to ensure the same range is unmapped, at the same page-table level. This will also be used to handle mapping and unmapping of overlapping VA ranges. Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 3 ++ arch/x86/mm/asi.c | 3 ++ arch/x86/mm/asi_pagetable.c | 71 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 77 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index b5dbc49..be1c190 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -24,6 +24,7 @@ enum page_table_level { struct asi { spinlock_t lock; /* protect all attributes */ pgd_t *pgd; /* ASI page-table */ + struct list_head mapping_list; /* list of VA range mapping */ /* * An ASI page-table can have direct references to the full kernel @@ -69,6 +70,8 @@ struct asi_session { void asi_init_backend(struct asi *asi); void asi_fini_backend(struct asi *asi); +void asi_init_range_mapping(struct asi *asi); +void asi_fini_range_mapping(struct asi *asi); extern struct asi *asi_create(void); extern void asi_destroy(struct asi *asi); diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index dfde245..25633a6 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -104,6 +104,8 @@ struct asi *asi_create(void) if (!asi) return NULL; + asi_init_range_mapping(asi); + page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO); if (!page) goto error; @@ -133,6 +135,7 @@ void asi_destroy(struct asi *asi) if (asi->pgd) free_page((unsigned long)asi->pgd); + asi_fini_range_mapping(asi); asi_fini_backend(asi); kfree(asi); diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index 0169395..a09a22d 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -5,10 +5,21 @@ */
#include +#include #include /* + * Structure to keep track of address ranges mapped into an ASI. + */ +struct asi_range_mapping { + struct list_head list; + void *ptr; /* range start address */ + size_t size; /* range size */ + enum page_table_level level; /* mapping level */ +}; + +/* * Get the pointer to the beginning of a page table directory from a page * table directory entry. */ @@ -75,6 +86,39 @@ void asi_fini_backend(struct asi *asi) } } +void asi_init_range_mapping(struct asi *asi) +{ + INIT_LIST_HEAD(&asi->mapping_list); +} + +void asi_fini_range_mapping(struct asi *asi) +{ + struct asi_range_mapping *range, *range_next; + + list_for_each_entry_safe(range, range_next, &asi->mapping_list, list) { + list_del(&range->list); + kfree(range); + } +} + +/* + * Return the range mapping starting at the specified address, or NULL if + * no such range is found. + */ +static struct asi_range_mapping *asi_get_range_mapping(struct asi *asi, + void *ptr) +{ + struct asi_range_mapping *range; + + lockdep_assert_held(&asi->lock); + list_for_each_entry(range, &asi->mapping_list, list) { + if (range->ptr == ptr) + return range; + } + + return NULL; +} + /* * Check if an offset in the address space isolation page-table is valid, * i.e. 
check that the offset is on a page effectively belonging to the @@ -574,6 +618,7 @@ static int asi_copy_pgd_range(struct asi *asi, int asi_map_range(struct asi *asi, void *ptr, size_t size, enum page_table_level level) { + struct asi_range_mapping *range_mapping; unsigned long addr = (unsigned long)ptr; unsigned long end = addr + ((unsigned long)size); unsigned long flags; @@ -582,8 +627,34 @@ int asi_map_range(struct asi *asi, void *ptr, size_t size, pr_debug("ASI %p: MAP %px/%lx/%d\n", asi, ptr, size, level); spin_lock_irqsave(&asi->lock, flags); + + /* check if the range is already mapped */ + range_mapping = asi_get_range_mapping(asi, ptr); + if (range_mapping) { + pr_debug("ASI %p: MAP %px/%lx/%d already mapped\n", + asi, ptr, size, level); + err = -EBUSY; + goto done; + } + + /* map new range */ + range_mapping = kmalloc(sizeof(*range_mapping), GFP_KERNEL); + if (!range_mapping) { + err = -ENOMEM; + goto done; + } + err = asi_copy_pgd_range(asi, asi->pgd, current->mm->pgd, addr, end, level); + if (err) + goto done; + + INIT_LIST_HEAD(&range_mapping->list); + range_mapping->ptr = ptr; + range_mapping->size = size; + range_mapping->level = level; + list_add(&range_mapping->list, &asi->mapping_list); +done: spin_unlock_irqrestore(&asi->lock, flags); return err; From patchwork Thu Jul 11 14:25:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040121 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1AE4F1395 for ; Thu, 11 Jul 2019 14:27:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0B8D728A5A for ; Thu, 11 Jul 2019 14:27:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id F3CFB28AC8; Thu, 11 Jul 2019 14:27:43 
+0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7C16C28A5A for ; Thu, 11 Jul 2019 14:27:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728760AbfGKO1h (ORCPT ); Thu, 11 Jul 2019 10:27:37 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:39270 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728440AbfGKO1g (ORCPT ); Thu, 11 Jul 2019 10:27:36 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOabr100891; Thu, 11 Jul 2019 14:26:22 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=Ccv45kVv21CrVP2UQjHtQ159yrdddBRkfeerFm1lQZs=; b=HGyms/pISRX5hcPdr6Z5qUYPoK8BQ52EIauuQPyhPhRre44rLyUHzvnoqOTRvXzrwhCv OTY05ptHcmicwwbGGPPXQN6m4TPthhy/xClxEZa4yF8uHSsKm7ZSH9u/tLfCQHG6hg1K +gEPReBvdFOuXQjB3QhAq8WvFQ4EtwdfdpRMWXWmCpj3REbSSnlT8iB9x+evcMVyxruc OTPrrYVoOXTGYADVN58b8Xu0sMhIRq/VtBIXKZS/5/ya9K9aD3oXpTX8QexDk1Ib94j1 ij8M3PZb4/wsMYxdM8hN/VwLMvqwLeYU8gvZc6wjF0NGEF6IBAOdWQJA2MvUwBG0KojU 6Q== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp2120.oracle.com with ESMTP id 2tjkkq0c8m-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:21 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu4021444; Thu, 11 Jul 2019 14:26:18 GMT From: Alexandre Chartre To: 
pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 11/26] mm/asi: Functions to clear ASI page-table entries for a VA range Date: Thu, 11 Jul 2019 16:25:23 +0200 Message-Id: <1562855138-19507-12-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=905 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Provide functions to clear page-table entries in the ASI page-table for a specified VA range. The functions also check that the clearing actually happens in the ASI page-table and that it does not cross the ASI page-table boundary (through references to the kernel page table), so that the kernel page table is not modified by mistake. Because information (address, size, page-table level) about VA ranges mapped into the ASI page-table is tracked, clearing only requires specifying the start address of the range.
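For illustration only (not part of the patch): a minimal user-space sketch of why tracking each mapped range lets the caller unmap with only a start address. The names `struct range`, `range_track()` and `range_untrack()` are hypothetical. The sketch uses a plain singly linked list and malloc() instead of the kernel's list_head and kmalloc(), and it models only the bookkeeping, not the page-table clearing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space model of a tracked VA range. */
struct range {
	struct range *next;
	void *ptr;		/* range start address */
	size_t size;		/* range size */
	int level;		/* page-table level used when mapping */
};

/* Record a range; refuse a second range with the same start address. */
static int range_track(struct range **head, void *ptr, size_t size, int level)
{
	struct range *r;

	assert(head != NULL);

	for (r = *head; r; r = r->next)
		if (r->ptr == ptr)
			return -EBUSY;

	r = malloc(sizeof(*r));
	if (!r)
		return -ENOMEM;

	r->ptr = ptr;
	r->size = size;
	r->level = level;
	r->next = *head;
	*head = r;
	return 0;
}

/*
 * Forget the range starting at ptr and report the size and level that
 * were recorded for it, so the caller can clear exactly what was mapped.
 */
static int range_untrack(struct range **head, void *ptr,
			 size_t *size, int *level)
{
	struct range **link, *r;

	assert(head != NULL);

	for (link = head; (r = *link) != NULL; link = &r->next) {
		if (r->ptr != ptr)
			continue;
		*size = r->size;
		*level = r->level;
		*link = r->next;
		free(r);
		return 0;
	}

	return -ENOENT;
}
```

An unmap routine built on this model would call `range_untrack()` first and then clear `[ptr, ptr + size)` at the reported level, so the caller never has to repeat the size or level used at map time.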
Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 1 + arch/x86/mm/asi_pagetable.c | 134 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 135 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index be1c190..919129f 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -83,6 +83,7 @@ extern bool asi_fault(struct pt_regs *regs, unsigned long error_code, extern int asi_map_range(struct asi *asi, void *ptr, size_t size, enum page_table_level level); extern int asi_map(struct asi *asi, void *ptr, unsigned long size); +extern void asi_unmap(struct asi *asi, void *ptr); /* * Copy the memory mapping for the current module. This is defined as a diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index a09a22d..7aee236 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -670,3 +670,137 @@ int asi_map(struct asi *asi, void *ptr, unsigned long size) return asi_map_range(asi, ptr, size, PGT_LEVEL_PTE); } EXPORT_SYMBOL(asi_map); + +static void asi_clear_pte_range(struct asi *asi, pmd_t *pmd, + unsigned long addr, unsigned long end) +{ + pte_t *pte; + + pte = asi_pte_offset(asi, pmd, addr); + if (IS_ERR(pte)) + return; + + do { + pte_clear(NULL, addr, pte); + } while (pte++, addr += PAGE_SIZE, addr < end); +} + +static void asi_clear_pmd_range(struct asi *asi, pud_t *pud, + unsigned long addr, unsigned long end, + enum page_table_level level) +{ + unsigned long next; + pmd_t *pmd; + + pmd = asi_pmd_offset(asi, pud, addr); + if (IS_ERR(pmd)) + return; + + do { + next = pmd_addr_end(addr, end); + if (pmd_none(*pmd)) + continue; + if (level == PGT_LEVEL_PMD || pmd_trans_huge(*pmd) || + pmd_devmap(*pmd)) { + pmd_clear(pmd); + continue; + } + asi_clear_pte_range(asi, pmd, addr, next); + } while (pmd++, addr = next, addr < end); +} + +static void asi_clear_pud_range(struct asi *asi, p4d_t *p4d, + unsigned long addr, unsigned
long end, + enum page_table_level level) +{ + unsigned long next; + pud_t *pud; + + pud = asi_pud_offset(asi, p4d, addr); + if (IS_ERR(pud)) + return; + + do { + next = pud_addr_end(addr, end); + if (pud_none(*pud)) + continue; + if (level == PGT_LEVEL_PUD || pud_trans_huge(*pud) || + pud_devmap(*pud)) { + pud_clear(pud); + continue; + } + asi_clear_pmd_range(asi, pud, addr, next, level); + } while (pud++, addr = next, addr < end); +} + +static void asi_clear_p4d_range(struct asi *asi, pgd_t *pgd, + unsigned long addr, unsigned long end, + enum page_table_level level) +{ + unsigned long next; + p4d_t *p4d; + + p4d = asi_p4d_offset(asi, pgd, addr); + if (IS_ERR(p4d)) + return; + + do { + next = p4d_addr_end(addr, end); + if (p4d_none(*p4d)) + continue; + if (level == PGT_LEVEL_P4D) { + p4d_clear(p4d); + continue; + } + asi_clear_pud_range(asi, p4d, addr, next, level); + } while (p4d++, addr = next, addr < end); +} + +static void asi_clear_pgd_range(struct asi *asi, pgd_t *pagetable, + unsigned long addr, unsigned long end, + enum page_table_level level) +{ + unsigned long next; + pgd_t *pgd; + + pgd = pgd_offset_pgd(pagetable, addr); + do { + next = pgd_addr_end(addr, end); + if (pgd_none(*pgd)) + continue; + if (level == PGT_LEVEL_PGD) { + pgd_clear(pgd); + continue; + } + asi_clear_p4d_range(asi, pgd, addr, next, level); + } while (pgd++, addr = next, addr < end); +} + +/* + * Clear page table entries in the specified ASI page-table. 
+ */ +void asi_unmap(struct asi *asi, void *ptr) +{ + struct asi_range_mapping *range_mapping; + unsigned long addr, end; + unsigned long flags; + + spin_lock_irqsave(&asi->lock, flags); + + range_mapping = asi_get_range_mapping(asi, ptr); + if (!range_mapping) { + pr_debug("ASI %p: UNMAP %px - not mapped\n", asi, ptr); + goto done; + } + + addr = (unsigned long)range_mapping->ptr; + end = addr + range_mapping->size; + pr_debug("ASI %p: UNMAP %px/%lx/%d\n", asi, ptr, + range_mapping->size, range_mapping->level); + asi_clear_pgd_range(asi, asi->pgd, addr, end, range_mapping->level); + list_del(&range_mapping->list); + kfree(range_mapping); +done: + spin_unlock_irqrestore(&asi->lock, flags); +} +EXPORT_SYMBOL(asi_unmap); From patchwork Thu Jul 11 14:25:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040137 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0D2D11395 for ; Thu, 11 Jul 2019 14:28:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F1F2628A5A for ; Thu, 11 Jul 2019 14:28:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E5AB428AC8; Thu, 11 Jul 2019 14:28:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9604328A5A for ; Thu, 11 Jul 2019 14:28:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729082AbfGKO2q (ORCPT 
); Thu, 11 Jul 2019 10:28:46 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:38288 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729074AbfGKO2q (ORCPT ); Thu, 11 Jul 2019 10:28:46 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOKHZ001595; Thu, 11 Jul 2019 14:26:25 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=gDByh8p+inBGj8eNOkt+0s0ErRCor6nCeXzUp1uJmZ0=; b=uIvYew5i+/N3gKnyr4ociYqq+qt56BKvpfHx2yG6ELbiXTDFF1FnRFdNiTxKLqzKpugT 7rZFZ27tp3XeOzJIDBEQWKZlGMdD04zWtCMR9kz9rtKT/I1eEKjmUoR0EfXOS4LZP41X SaOmMD0h5rmiZKjuWODlCFnTwFv/w3OU84qMp7I8vBQjvkB4kLSw+vU/HUfR9FiS4qmM dt9lswqhmPEYZvrjigJDb4qNQfKlbGFfmUQluSi/uyNCjdt/jH+m3QqAeoRgbBE5SECi sbhB6HhXJDdRZ5C1Vh+uI9HzCIOa4DMf+C+kQvxHt3TG4Fqj9spPHDE/SpvfNokpoD2Z Tw== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0dyd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:25 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu5021444; Thu, 11 Jul 2019 14:26:21 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 12/26] mm/asi: Function to copy page-table entries for percpu buffer Date: Thu, 11 Jul 2019 16:25:24 +0200 Message-Id: 
<1562855138-19507-13-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=895 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Provide functions to copy page-table entries from the kernel page-table to an ASI page-table for a percpu buffer. A percpu buffer has a different VA range for each CPU, and all of them have to be copied. Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 6 ++++++ arch/x86/mm/asi_pagetable.c | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 919129f..912b6a7 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -105,6 +105,12 @@ static inline int asi_map_module(struct asi *asi, char *module_name) return asi_map(asi, module->core_layout.base, module->core_layout.size); } +#define ASI_MAP_CPUVAR(asi, cpuvar) \ + asi_map_percpu(asi, &cpuvar, sizeof(cpuvar)) + +extern int asi_map_percpu(struct asi *asi, void *percpu_ptr, size_t size); +extern void asi_unmap_percpu(struct asi *asi, void *percpu_ptr); + /* * Function to exit the current isolation. This is used to abort isolation * when a task using isolation is scheduled out.
diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index 7aee236..a4fe867 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -804,3 +804,41 @@ void asi_unmap(struct asi *asi, void *ptr) spin_unlock_irqrestore(&asi->lock, flags); } EXPORT_SYMBOL(asi_unmap); + +void asi_unmap_percpu(struct asi *asi, void *percpu_ptr) +{ + void *ptr; + int cpu; + + pr_debug("ASI %p: UNMAP PERCPU %px\n", asi, percpu_ptr); + for_each_possible_cpu(cpu) { + ptr = per_cpu_ptr(percpu_ptr, cpu); + pr_debug("ASI %p: UNMAP PERCPU%d %px\n", asi, cpu, ptr); + asi_unmap(asi, ptr); + } +} +EXPORT_SYMBOL(asi_unmap_percpu); + +int asi_map_percpu(struct asi *asi, void *percpu_ptr, size_t size) +{ + int cpu, err; + void *ptr; + + pr_debug("ASI %p: MAP PERCPU %px\n", asi, percpu_ptr); + for_each_possible_cpu(cpu) { + ptr = per_cpu_ptr(percpu_ptr, cpu); + pr_debug("ASI %p: MAP PERCPU%d %px\n", asi, cpu, ptr); + err = asi_map(asi, ptr, size); + if (err) { + /* + * Need to unmap any percpu mapping which has + * succeeded before the failure. 
+ */ + asi_unmap_percpu(asi, percpu_ptr); + return err; + } + } + + return 0; +} +EXPORT_SYMBOL(asi_map_percpu); From patchwork Thu Jul 11 14:25:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040173 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 06508112C for ; Thu, 11 Jul 2019 14:29:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EBDEC28ABE for ; Thu, 11 Jul 2019 14:29:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E004C28AD1; Thu, 11 Jul 2019 14:29:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 94B8E28ABE for ; Thu, 11 Jul 2019 14:29:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728808AbfGKO1y (ORCPT ); Thu, 11 Jul 2019 10:27:54 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:37316 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728823AbfGKO1y (ORCPT ); Thu, 11 Jul 2019 10:27:54 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOAvB001518; Thu, 11 Jul 2019 14:26:29 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=Ag0k+tMlXoCUCjgmP3NxD1Cl4FnNgcaLg1SLsqSglmA=; 
b=keVQsRLhZyNb/dFJnnzguyu+6Vl7qT1iW80eQJUi6p8HUYNezOn2M6QRB0NcNQhQI7JT 5HACwzQ7vK/fKkmWqNIFxS0mv8uBEA1J5s9pHNXY/wHvMqw/x77ya8pCk5djXXg3w8tA Y+DFHNzDwpT4d81yj2dhIMIFubn8ssOAMPMn8hTz8Cg2l5maUEKHDpqBF2nYVJbVIius zq+q7wJ21XaxrzeREUMqbHN8popuelxmyOtwjWh5plKHWNmsE8C4KEMn9NIt7tf3KMir nsOpTlKGCKH8e0WHGLCVGvgf9UFWaCxJYPAxRKOlsGhf5idexkvlsNaYd43tN8SZVOoQ SA== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0dyw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:28 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu6021444; Thu, 11 Jul 2019 14:26:25 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 13/26] mm/asi: Add asi_remap() function Date: Thu, 11 Jul 2019 16:25:25 +0200 Message-Id: <1562855138-19507-14-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=832 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 
definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add a function to remap an already mapped buffer with a new address in an ASI page-table: the already mapped buffer is unmapped, and a new mapping is added for the specified new address. This is useful to track and remap a buffer which can be freed and then reallocated. Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 1 + arch/x86/mm/asi_pagetable.c | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 912b6a7..cf5d198 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -84,6 +84,7 @@ extern int asi_map_range(struct asi *asi, void *ptr, size_t size, enum page_table_level level); extern int asi_map(struct asi *asi, void *ptr, unsigned long size); extern void asi_unmap(struct asi *asi, void *ptr); +extern int asi_remap(struct asi *asi, void **mapping, void *ptr, size_t size); /* * Copy the memory mapping for the current module. 
This is defined as a diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index a4fe867..1ff0c47 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -842,3 +842,28 @@ int asi_map_percpu(struct asi *asi, void *percpu_ptr, size_t size) return 0; } EXPORT_SYMBOL(asi_map_percpu); + +int asi_remap(struct asi *asi, void **current_ptrp, void *new_ptr, size_t size) +{ + void *current_ptr = *current_ptrp; + int err; + + if (current_ptr == new_ptr) { + /* no change, already mapped */ + return 0; + } + + if (current_ptr) { + asi_unmap(asi, current_ptr); + *current_ptrp = NULL; + } + + err = asi_map(asi, new_ptr, size); + if (err) + return err; + + *current_ptrp = new_ptr; + + return 0; +} +EXPORT_SYMBOL(asi_remap); From patchwork Thu Jul 11 14:25:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040123 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 93AFB1395 for ; Thu, 11 Jul 2019 14:27:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 82DEF28A5A for ; Thu, 11 Jul 2019 14:27:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 751D428AC8; Thu, 11 Jul 2019 14:27:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 755C328A5A for ; Thu, 11 Jul 2019 14:27:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via 
listexpand id S1728812AbfGKO1t (ORCPT ); Thu, 11 Jul 2019 10:27:49 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:37146 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728809AbfGKO1r (ORCPT ); Thu, 11 Jul 2019 10:27:47 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO9em001500; Thu, 11 Jul 2019 14:26:32 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=/snaNkVEEW4nTR1xAXfvlEoe9rj/2L0Z92puglvXclU=; b=DN3eG8DXxQ7ozWrPnuAeiliXI+C+IxsV4od5f1TGtSHKulg61bcs9xHwRh7Wbxnghz9I ThGtm4GjaEXofJ/g9FZ3+haVgxr28XtB+ucHwwPynqaJaB3a3cRcsSIJZKS7VGg7EiMw rNBOsKQ9GocLmZ4MB3gTzKkSMvJmagNALGBonkOZRxAN95EUC0/Wbwu+HXXfAxWl18C9 GkNsUrqBGLn4Ir8iOvlMsFHQtMtfTKAD2WUAOBydyAaB9K/Tb/BB4EigTZEu+vozMNyu ysx0uiZcdvKRUDvzf4/aPdWlIS0c9udhkbyOJiJWUg4LAaH97xlf2yC2mUSNfBoqVSwX 4Q== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0e09-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:32 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu7021444; Thu, 11 Jul 2019 14:26:28 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 14/26] mm/asi: Handle ASI mapped range leaks and overlaps Date: Thu, 11 Jul 2019 
16:25:26 +0200 Message-Id: <1562855138-19507-15-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=857 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When mapping a buffer in an ASI page-table, data around the buffer can also be mapped if the entire buffer is not aligned with the page directory size used for the mapping. So, data can potentially leak into the ASI page-table. In such a case, print a warning that data are leaking. Also data effectively mapped can overlap with an already mapped buffer. This is not an issue when mapping data but, when unmapping, make sure data from another buffer don't get unmapped as a side effect. Signed-off-by: Alexandre Chartre --- arch/x86/mm/asi_pagetable.c | 230 +++++++++++++++++++++++++++++++++++++++---- 1 files changed, 212 insertions(+), 18 deletions(-) diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index 1ff0c47..f1ee65b 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -9,6 +9,14 @@ #include +static unsigned long page_directory_size[] = { + [PGT_LEVEL_PTE] = PAGE_SIZE, + [PGT_LEVEL_PMD] = PMD_SIZE, + [PGT_LEVEL_PUD] = PUD_SIZE, + [PGT_LEVEL_P4D] = P4D_SIZE, + [PGT_LEVEL_PGD] = PGDIR_SIZE, +}; + /* * Structure to keep track of address ranges mapped into an ASI. 
*/ @@ -17,8 +25,16 @@ struct asi_range_mapping { void *ptr; /* range start address */ size_t size; /* range size */ enum page_table_level level; /* mapping level */ + int overlap; /* overlap count */ }; +#define ASI_RANGE_MAP_ADDR(r) \ + round_down((unsigned long)((r)->ptr), page_directory_size[(r)->level]) + +#define ASI_RANGE_MAP_END(r) \ + round_up((unsigned long)((r)->ptr + (r)->size), \ + page_directory_size[(r)->level]) + /* * Get the pointer to the beginning of a page table directory from a page * table directory entry. @@ -609,6 +625,71 @@ static int asi_copy_pgd_range(struct asi *asi, return 0; } + +/* + * Map a VA range, taking into account any overlap with already mapped + * VA ranges. On error, return < 0. Otherwise return the number of + * ranges the specified range is overlapping with. + */ +static int asi_map_overlap(struct asi *asi, void *ptr, size_t size, + enum page_table_level level) +{ + unsigned long map_addr, map_end; + unsigned long addr, end; + struct asi_range_mapping *range; + bool need_mapping; + int err, overlap; + + addr = (unsigned long)ptr; + end = addr + (unsigned long)size; + need_mapping = true; + overlap = 0; + + lockdep_assert_held(&asi->lock); + list_for_each_entry(range, &asi->mapping_list, list) { + + if (range->ptr == ptr && range->size == size) { + /* we are mapping the same range again */ + pr_debug("ASI %p: MAP %px/%lx/%d already mapped\n", + asi, ptr, size, level); + return -EBUSY; + } + + /* check overlap with mapped range */ + map_addr = ASI_RANGE_MAP_ADDR(range); + map_end = ASI_RANGE_MAP_END(range); + if (end <= map_addr || addr >= map_end) { + /* no overlap, continue */ + continue; + } + + pr_debug("ASI %p: MAP %px/%lx/%d overlaps with %px/%lx/%d\n", + asi, ptr, size, level, + range->ptr, range->size, range->level); + range->overlap++; + overlap++; + + /* + * Check if new range is included into an existing range. + * If so then the new range is already entirely mapped. 
+ */ + if (addr >= map_addr && end <= map_end) { + pr_debug("ASI %p: MAP %px/%lx/%d implicitly mapped\n", + asi, ptr, size, level); + need_mapping = false; + } + } + + if (need_mapping) { + err = asi_copy_pgd_range(asi, asi->pgd, current->mm->pgd, + addr, end, level); + if (err) + return err; + } + + return overlap; +} + /* * Copy page table entries from the current page table (i.e. from the * kernel page table) to the specified ASI page-table. The level @@ -619,44 +700,53 @@ int asi_map_range(struct asi *asi, void *ptr, size_t size, enum page_table_level level) { struct asi_range_mapping *range_mapping; + unsigned long page_dir_size = page_directory_size[level]; unsigned long addr = (unsigned long)ptr; unsigned long end = addr + ((unsigned long)size); + unsigned long map_addr, map_end; unsigned long flags; - int err; + int err, overlap; + + map_addr = round_down(addr, page_dir_size); + map_end = round_up(end, page_dir_size); - pr_debug("ASI %p: MAP %px/%lx/%d\n", asi, ptr, size, level); + pr_debug("ASI %p: MAP %px/%lx/%d -> %lx-%lx\n", asi, ptr, size, level, + map_addr, map_end); + if (map_addr < addr) + pr_debug("ASI %p: MAP LEAK %lx-%lx\n", asi, map_addr, addr); + if (map_end > end) + pr_debug("ASI %p: MAP LEAK %lx-%lx\n", asi, end, map_end); spin_lock_irqsave(&asi->lock, flags); - /* check if the range is already mapped */ - range_mapping = asi_get_range_mapping(asi, ptr); - if (range_mapping) { - pr_debug("ASI %p: MAP %px/%lx/%d already mapped\n", - asi, ptr, size, level); - err = -EBUSY; - goto done; + /* + * Map the new range with taking overlap with already mapped ranges + * into account. 
+ */ + overlap = asi_map_overlap(asi, ptr, size, level); + if (overlap < 0) { + err = overlap; + goto error; } - /* map new range */ + /* add new range */ range_mapping = kmalloc(sizeof(*range_mapping), GFP_KERNEL); if (!range_mapping) { err = -ENOMEM; - goto done; + goto error; } - err = asi_copy_pgd_range(asi, asi->pgd, current->mm->pgd, - addr, end, level); - if (err) - goto done; - INIT_LIST_HEAD(&range_mapping->list); range_mapping->ptr = ptr; range_mapping->size = size; range_mapping->level = level; + range_mapping->overlap = overlap; list_add(&range_mapping->list, &asi->mapping_list); -done: spin_unlock_irqrestore(&asi->lock, flags); + return 0; +error: + spin_unlock_irqrestore(&asi->lock, flags); return err; } EXPORT_SYMBOL(asi_map_range); @@ -776,6 +866,110 @@ static void asi_clear_pgd_range(struct asi *asi, pgd_t *pagetable, } while (pgd++, addr = next, addr < end); } + +/* + * Unmap a VA range, taking into account any overlap with other mapped + * VA ranges. This unmaps the specified range, then remaps any range this + * range was overlapping with. + */ +static void asi_unmap_overlap(struct asi *asi, struct asi_range_mapping *range) +{ + unsigned long map_addr, map_end; + struct asi_range_mapping *r; + unsigned long addr, end; + unsigned long r_addr; + bool need_unmapping; + int err, overlap; + + addr = (unsigned long)range->ptr; + end = addr + (unsigned long)range->size; + overlap = range->overlap; + need_unmapping = true; + + lockdep_assert_held(&asi->lock); + + /* + * Adjust overlap information and check if range effectively needs + * to be unmapped.
+ */ + list_for_each_entry(r, &asi->mapping_list, list) { + + if (!overlap) { + /* no more overlap */ + break; + } + + WARN_ON(range->ptr == r->ptr && range->size == r->size); + + /* check overlap with other range (end is exclusive) */ + map_addr = ASI_RANGE_MAP_ADDR(r); + map_end = ASI_RANGE_MAP_END(r); + if (end <= map_addr || addr >= map_end) { + /* no overlap, continue */ + continue; + } + + pr_debug("ASI %p: UNMAP %px/%lx/%d overlaps with %px/%lx/%d\n", + asi, range->ptr, range->size, range->level, + r->ptr, r->size, r->level); + r->overlap--; + overlap--; + + /* + * Check if range is included into a remaining mapped range. + * If so then there's no need to unmap. + */ + if (map_addr <= addr && end <= map_end) { + pr_debug("ASI %p: UNMAP %px/%lx/%d still mapped\n", + asi, range->ptr, range->size, range->level); + need_unmapping = false; + } + } + + WARN_ON(overlap); + + if (need_unmapping) { + asi_clear_pgd_range(asi, asi->pgd, addr, end, range->level); + + /* + * Remap all ranges we overlap with, as clearing the mapping + * will have unmapped the overlap. + */ + overlap = range->overlap; + list_for_each_entry(r, &asi->mapping_list, list) { + if (!overlap) { + /* no more overlap */ + break; + } + + /* check overlap with other range (end is exclusive) */ + map_addr = ASI_RANGE_MAP_ADDR(r); + map_end = ASI_RANGE_MAP_END(r); + if (end <= map_addr || addr >= map_end) { + /* no overlap, continue */ + continue; + } + pr_debug("ASI %p: UNMAP %px/%lx/%d remaps %px/%lx/%d\n", + asi, range->ptr, range->size, range->level, + r->ptr, r->size, r->level); + overlap--; + + r_addr = (unsigned long)r->ptr; + err = asi_copy_pgd_range(asi, asi->pgd, + current->mm->pgd, + r_addr, r_addr + r->size, + r->level); + if (err) { + pr_debug("ASI %p: UNMAP %px/%lx/%d remaps %px/%lx/%d error %d\n", + asi, range->ptr, range->size, + range->level, + r->ptr, r->size, r->level, + err); + } + } + } +} + /* * Clear page table entries in the specified ASI page-table.
*/ @@ -797,8 +991,8 @@ void asi_unmap(struct asi *asi, void *ptr) end = addr + range_mapping->size; pr_debug("ASI %p: UNMAP %px/%lx/%d\n", asi, ptr, range_mapping->size, range_mapping->level); - asi_clear_pgd_range(asi, asi->pgd, addr, end, range_mapping->level); list_del(&range_mapping->list); + asi_unmap_overlap(asi, range_mapping); kfree(range_mapping); done: spin_unlock_irqrestore(&asi->lock, flags); From patchwork Thu Jul 11 14:25:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040171 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 68F3914E5 for ; Thu, 11 Jul 2019 14:29:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5A50C28ABE for ; Thu, 11 Jul 2019 14:29:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4E7C928AC8; Thu, 11 Jul 2019 14:29:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B3D0028AD9 for ; Thu, 11 Jul 2019 14:29:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728830AbfGKO1y (ORCPT ); Thu, 11 Jul 2019 10:27:54 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:39580 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728809AbfGKO1x (ORCPT ); Thu, 11 Jul 2019 10:27:53 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by 
aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO7gP100410; Thu, 11 Jul 2019 14:26:39 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=tktXu8oSlvrRNi4oCpkMHKHtnQ30RjQ+vGHk9xchmuI=; b=Xw8imofO8fKNfBvyII53RRKIWPHJ/UdjgH1ZSaBocnV0EADrjb1GTZYN4HsE7xrVPRc/ TEHR2f4ozNEvHmT4xkzmrRWShe8NJkdGTGU5UyvTwwVLuobGplkQYt22venKGQfjZsCi BuPNZ1Zy1I5Lm3q4cyPwf0rvVmiPcVoh9A/tJ1V5PFUsSDajeMDtZazluCVycnUMAv6u /YyGN39NkO9rtwovlVRiIRORlVYEoHIigSvvQZaOjXC7B6+jVAU9bBj+zUDnf/Rei8Mf yGpCGEFGnOwzX/ATnI3MWljtUlcXz+wMpxZcI3AhvYUC5PX/005SgrPaW6hQhMYqVhz6 ew== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp2120.oracle.com with ESMTP id 2tjkkq0cat-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:39 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu8021444; Thu, 11 Jul 2019 14:26:31 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 15/26] mm/asi: Initialize the ASI page-table with core mappings Date: Thu, 11 Jul 2019 16:25:27 +0200 Message-Id: <1562855138-19507-16-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 
definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Core mappings are the minimal mappings we need to be able to enter isolation and handle an isolation abort or exit. This includes the kernel code, the GDT and the percpu ASI sessions. We also need a stack, so we map the current stack when entering isolation and unmap it on exit/abort. Optionally, additional mappings can be added, such as the stack canary or the percpu offset, to be able to use get_cpu_var()/this_cpu_ptr() when isolation is active. Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 9 ++++- arch/x86/mm/asi.c | 75 +++++++++++++++++++++++++++++++++++++++--- arch/x86/mm/asi_pagetable.c | 30 ++++++++++++---- 3 files changed, 99 insertions(+), 15 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index cf5d198..1ac8fd3 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -11,6 +11,13 @@ #include #include +/* + * asi_create() map flags. Flags are used to map optional data + * when creating an ASI. 
+ */ +#define ASI_MAP_STACK_CANARY 0x01 /* map stack canary */ +#define ASI_MAP_CPU_PTR 0x02 /* for get_cpu_var()/this_cpu_ptr() */ + enum page_table_level { PGT_LEVEL_PTE, PGT_LEVEL_PMD, @@ -73,7 +80,7 @@ struct asi_session { void asi_init_range_mapping(struct asi *asi); void asi_fini_range_mapping(struct asi *asi); -extern struct asi *asi_create(void); +extern struct asi *asi_create(int map_flags); extern void asi_destroy(struct asi *asi); extern int asi_enter(struct asi *asi); extern void asi_exit(struct asi *asi); diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index 25633a6..f049438 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -19,6 +19,17 @@ /* ASI sessions, one per cpu */ DEFINE_PER_CPU_PAGE_ALIGNED(struct asi_session, cpu_asi_session); +struct asi_map_option { + int flag; + void *ptr; + size_t size; +}; + +struct asi_map_option asi_map_percpu_options[] = { + { ASI_MAP_STACK_CANARY, &fixed_percpu_data, sizeof(fixed_percpu_data) }, + { ASI_MAP_CPU_PTR, &this_cpu_off, sizeof(this_cpu_off) }, +}; + static void asi_log_fault(struct asi *asi, struct pt_regs *regs, unsigned long error_code, unsigned long address) { @@ -85,16 +96,55 @@ bool asi_fault(struct pt_regs *regs, unsigned long error_code, return true; } -static int asi_init_mapping(struct asi *asi) +static int asi_init_mapping(struct asi *asi, int flags) { + struct asi_map_option *option; + int i, err; + + /* + * Map the kernel. + * + * XXX We should check if we can map only kernel text, i.e. map with + * size = _etext - _text + */ + err = asi_map(asi, (void *)__START_KERNEL_map, KERNEL_IMAGE_SIZE); + if (err) + return err; + /* - * TODO: Populate the ASI page-table with minimal mappings so - * that we can at least enter isolation and abort. + * Map the cpu_entry_area because we need the GDT to be mapped. + * Not sure we need anything else from cpu_entry_area. 
*/ + err = asi_map_range(asi, (void *)CPU_ENTRY_AREA_PER_CPU, P4D_SIZE, + PGT_LEVEL_P4D); + if (err) + return err; + + /* + * Map the percpu ASI sessions. This is used by interrupt handlers + * to figure out if we have entered isolation and switch back to + * the kernel address space. + */ + err = ASI_MAP_CPUVAR(asi, cpu_asi_session); + if (err) + return err; + + /* + * Optional percpu mappings. + */ + for (i = 0; i < ARRAY_SIZE(asi_map_percpu_options); i++) { + option = &asi_map_percpu_options[i]; + if (flags & option->flag) { + err = asi_map_percpu(asi, option->ptr, option->size); + if (err) + return err; + } + } + return 0; } -struct asi *asi_create(void) +struct asi *asi_create(int map_flags) { struct page *page; struct asi *asi; @@ -115,7 +165,7 @@ struct asi *asi_create(void) spin_lock_init(&asi->fault_lock); asi_init_backend(asi); - err = asi_init_mapping(asi); + err = asi_init_mapping(asi, map_flags); if (err) goto error; @@ -159,6 +209,7 @@ int asi_enter(struct asi *asi) struct asi *current_asi; struct asi_session *asi_session; unsigned long original_cr3; + int err; state = this_cpu_read(cpu_asi_session.state); /* @@ -190,6 +241,13 @@ int asi_enter(struct asi *asi) WARN_ON(asi_session->abort_depth > 0); /* + * We need a stack to run with isolation, so map the current stack. 
+ */ + err = asi_map(asi, current->stack, PAGE_SIZE << THREAD_SIZE_ORDER); + if (err) + goto err_clear_asi; + + /* * Instructions ordering is important here because we should be * able to deal with any interrupt/exception which will abort * the isolation and restore CR3 to its original value: @@ -211,7 +269,7 @@ int asi_enter(struct asi *asi) if (!original_cr3) { WARN_ON(1); err = -EINVAL; - goto err_clear_asi; + goto err_unmap_stack; } asi_session->original_cr3 = original_cr3; @@ -228,6 +286,8 @@ int asi_enter(struct asi *asi) return 0; +err_unmap_stack: + asi_unmap(asi, current->stack); err_clear_asi: asi_session->asi = NULL; asi_session->task = NULL; @@ -284,6 +344,9 @@ void asi_exit(struct asi *asi) * exit isolation before abort_depth reaches 0. */ asi_session->abort_depth = 0; + + /* unmap stack */ + asi_unmap(asi, current->stack); } EXPORT_SYMBOL(asi_exit); diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index f1ee65b..bcc95f2 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -710,12 +710,20 @@ int asi_map_range(struct asi *asi, void *ptr, size_t size, map_addr = round_down(addr, page_dir_size); map_end = round_up(end, page_dir_size); - pr_debug("ASI %p: MAP %px/%lx/%d -> %lx-%lx\n", asi, ptr, size, level, - map_addr, map_end); - if (map_addr < addr) - pr_debug("ASI %p: MAP LEAK %lx-%lx\n", asi, map_addr, addr); - if (map_end > end) - pr_debug("ASI %p: MAP LEAK %lx-%lx\n", asi, end, map_end); + /* + * Don't log info the current stack because it is mapped/unmapped + * everytime we enter/exit isolation. 
+ */ + if (ptr != current->stack) { + pr_debug("ASI %p: MAP %px/%lx/%d -> %lx-%lx\n", + asi, ptr, size, level, map_addr, map_end); + if (map_addr < addr) + pr_debug("ASI %p: MAP LEAK %lx-%lx\n", + asi, map_addr, addr); + if (map_end > end) + pr_debug("ASI %p: MAP LEAK %lx-%lx\n", + asi, end, map_end); + } spin_lock_irqsave(&asi->lock, flags); @@ -989,8 +997,14 @@ void asi_unmap(struct asi *asi, void *ptr) addr = (unsigned long)range_mapping->ptr; end = addr + range_mapping->size; - pr_debug("ASI %p: UNMAP %px/%lx/%d\n", asi, ptr, - range_mapping->size, range_mapping->level); + /* + * Don't log info the current stack because it is mapped/unmapped + * everytime we enter/exit isolation. + */ + if (ptr != current->stack) { + pr_debug("ASI %p: UNMAP %px/%lx/%d\n", asi, ptr, + range_mapping->size, range_mapping->level); + } list_del(&range_mapping->list); asi_unmap_overlap(asi, range_mapping); kfree(range_mapping); From patchwork Thu Jul 11 14:25:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040157 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1E04C1395 for ; Thu, 11 Jul 2019 14:29:21 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0D00028ABE for ; Thu, 11 Jul 2019 14:29:21 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 00CBA28AD1; Thu, 11 Jul 2019 14:29:20 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by 
mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8F19128ABE for ; Thu, 11 Jul 2019 14:29:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728984AbfGKO2Y (ORCPT ); Thu, 11 Jul 2019 10:28:24 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:37832 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728926AbfGKO2T (ORCPT ); Thu, 11 Jul 2019 10:28:19 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO90p001480; Thu, 11 Jul 2019 14:26:38 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=k/9Fug63afNC647KTGwGwhcj7r27BkaO1PdUgfdBHl8=; b=5KLn6vAlS8mqB4CpSvgIU+Q+dh/FIPZ30597ShKunxhzPyWufNzv9Z2Cx08UVoBiEll/ JBVMx/vKSYB0C0s+WVyLcIBN58ZXRTbwEhf/IwNYl7zZ7sSsGPjmF/6hyaDRwuF9hk/r Od3KI8aMoJde526KmMFcX2swAUblIJ6IEJr7eaGDmgCNG3lLZyVRqS3W4XYV29OC/ZVc 4mk68QurguhhWeX8FUCmmK47mMaaLROKzVLzTFajITGGIClxoo3c+fKTYG1lt2Dpitky k9uYwYmfAA4lryCWr9Zg3nU9L0blXjaO/G+ZtIQSyB+mD3low+GPiTf4kzuLZVh/PRPA cg== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0e0w-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:38 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcu9021444; Thu, 11 Jul 2019 14:26:34 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, 
graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 16/26] mm/asi: Option to map current task into ASI Date: Thu, 11 Jul 2019 16:25:28 +0200 Message-Id: <1562855138-19507-17-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=853 adultscore=26 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add an option to map the current task into an ASI page-table. The task is mapped when entering isolation and unmapped on abort/exit. 
Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 2 ++ arch/x86/mm/asi.c | 25 +++++++++++++++++++++---- arch/x86/mm/asi_pagetable.c | 4 ++-- 3 files changed, 25 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 1ac8fd3..a277e43 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -17,6 +17,7 @@ */ #define ASI_MAP_STACK_CANARY 0x01 /* map stack canary */ #define ASI_MAP_CPU_PTR 0x02 /* for get_cpu_var()/this_cpu_ptr() */ +#define ASI_MAP_CURRENT_TASK 0x04 /* map the current task */ enum page_table_level { PGT_LEVEL_PTE, @@ -31,6 +32,7 @@ enum page_table_level { struct asi { spinlock_t lock; /* protect all attributes */ pgd_t *pgd; /* ASI page-table */ + int mapping_flags; /* map flags */ struct list_head mapping_list; /* list of VA range mapping */ /* diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index f049438..acd1135 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -28,6 +28,7 @@ struct asi_map_option { struct asi_map_option asi_map_percpu_options[] = { { ASI_MAP_STACK_CANARY, &fixed_percpu_data, sizeof(fixed_percpu_data) }, { ASI_MAP_CPU_PTR, &this_cpu_off, sizeof(this_cpu_off) }, + { ASI_MAP_CURRENT_TASK, &current_task, sizeof(current_task) }, }; static void asi_log_fault(struct asi *asi, struct pt_regs *regs, @@ -96,8 +97,9 @@ bool asi_fault(struct pt_regs *regs, unsigned long error_code, return true; } -static int asi_init_mapping(struct asi *asi, int flags) +static int asi_init_mapping(struct asi *asi) { + int flags = asi->mapping_flags; struct asi_map_option *option; int i, err; @@ -164,8 +166,9 @@ struct asi *asi_create(int map_flags) spin_lock_init(&asi->lock); spin_lock_init(&asi->fault_lock); asi_init_backend(asi); + asi->mapping_flags = map_flags; - err = asi_init_mapping(asi, map_flags); + err = asi_init_mapping(asi); if (err) goto error; @@ -248,6 +251,15 @@ int asi_enter(struct asi *asi) goto err_clear_asi; /* + * Optionally, also map the current 
task. + */ + if (asi->mapping_flags & ASI_MAP_CURRENT_TASK) { + err = asi_map(asi, current, sizeof(struct task_struct)); + if (err) + goto err_unmap_stack; + } + + /* * Instructions ordering is important here because we should be * able to deal with any interrupt/exception which will abort * the isolation and restore CR3 to its original value: @@ -269,7 +281,7 @@ int asi_enter(struct asi *asi) if (!original_cr3) { WARN_ON(1); err = -EINVAL; - goto err_unmap_stack; + goto err_unmap_task; } asi_session->original_cr3 = original_cr3; @@ -286,6 +298,9 @@ int asi_enter(struct asi *asi) return 0; +err_unmap_task: + if (asi->mapping_flags & ASI_MAP_CURRENT_TASK) + asi_unmap(asi, current); err_unmap_stack: asi_unmap(asi, current->stack); err_clear_asi: @@ -345,8 +360,10 @@ void asi_exit(struct asi *asi) */ asi_session->abort_depth = 0; - /* unmap stack */ + /* unmap stack and task */ asi_unmap(asi, current->stack); + if (asi->mapping_flags & ASI_MAP_CURRENT_TASK) + asi_unmap(asi, current); } EXPORT_SYMBOL(asi_exit); diff --git a/arch/x86/mm/asi_pagetable.c b/arch/x86/mm/asi_pagetable.c index bcc95f2..8076626 100644 --- a/arch/x86/mm/asi_pagetable.c +++ b/arch/x86/mm/asi_pagetable.c @@ -714,7 +714,7 @@ int asi_map_range(struct asi *asi, void *ptr, size_t size, * Don't log info the current stack because it is mapped/unmapped * everytime we enter/exit isolation. */ - if (ptr != current->stack) { + if (ptr != current->stack && ptr != current) { pr_debug("ASI %p: MAP %px/%lx/%d -> %lx-%lx\n", asi, ptr, size, level, map_addr, map_end); if (map_addr < addr) @@ -1001,7 +1001,7 @@ void asi_unmap(struct asi *asi, void *ptr) * Don't log info the current stack because it is mapped/unmapped * everytime we enter/exit isolation. 
*/ - if (ptr != current->stack) { + if (ptr != current->stack && ptr != current) { pr_debug("ASI %p: UNMAP %px/%lx/%d\n", asi, ptr, range_mapping->size, range_mapping->level); } From patchwork Thu Jul 11 14:25:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040163 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4E0F914E5 for ; Thu, 11 Jul 2019 14:29:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3D45F28AC8 for ; Thu, 11 Jul 2019 14:29:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3100428AD1; Thu, 11 Jul 2019 14:29:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A477528ADC for ; Thu, 11 Jul 2019 14:29:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728835AbfGKO3Z (ORCPT ); Thu, 11 Jul 2019 10:29:25 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:43852 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728778AbfGKO3Y (ORCPT ); Thu, 11 Jul 2019 10:29:24 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO75F013247; Thu, 11 Jul 2019 14:26:41 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; 
s=corp-2018-07-02; bh=DxKvEFI9OqvnwAFHfsS7VJpKVi8ynRLOZoMuRYswZIw=; b=k3XXPihT1uvy+OLIiRmWMdfDXIT+WHMFAjj45U6uCpQ8rV2jfdSnRJmMQA1mV5ALTQyT PERU4/YPhFf4ckQAGG2rOQDm5vbl52OROkwzviVZDqxaVQLQCpGKejd6n6Gi8g/OW973 b1WRq4uNsMv9cAwP/H6vRscq1e3iobbZ324Tq6PgoCdWFemJ/1YPJhNIMn+keJv5JOQh Ff8q9kmaJ7pYkwC9amWnZvecFXs6OmpibtOtjhm23oXx9IX6dzIXv0YiCsY7b8cavCtn S3SvjejyJfLMAMiB7WwpFazOkLIoHuippfiDTsTbnPYepjTPVENPQMOCjagrpefgERlp TA== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2120.oracle.com with ESMTP id 2tjm9r0brp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:41 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuA021444; Thu, 11 Jul 2019 14:26:37 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 17/26] rcu: Move tree.h static forward declarations to tree.c Date: Thu, 11 Jul 2019 16:25:29 +0200 Message-Id: <1562855138-19507-18-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 
mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP tree.h has static forward declarations for inline functions defined in tree_plugin.h and tree_stall.h. These forward declarations prevent including tree.h in any file other than tree.c. Signed-off-by: Alexandre Chartre --- kernel/rcu/tree.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++ kernel/rcu/tree.h | 55 +---------------------------------------------------- 2 files changed, 55 insertions(+), 54 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 980ca3c..44dd3b4 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -55,6 +55,60 @@ #include "tree.h" #include "rcu.h" +/* Forward declarations for tree_plugin.h */ +static void rcu_bootup_announce(void); +static void rcu_qs(void); +static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp); +#ifdef CONFIG_HOTPLUG_CPU +static bool rcu_preempt_has_tasks(struct rcu_node *rnp); +#endif /* #ifdef CONFIG_HOTPLUG_CPU */ +static int rcu_print_task_exp_stall(struct rcu_node *rnp); +static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp); +static void rcu_flavor_sched_clock_irq(int user); +static void dump_blkd_tasks(struct rcu_node *rnp, int ncheck); +static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags); +static void rcu_preempt_boost_start_gp(struct rcu_node *rnp); +static void invoke_rcu_callbacks_kthread(void); +static bool rcu_is_callbacks_kthread(void); +static void __init rcu_spawn_boost_kthreads(void); +static void rcu_prepare_kthreads(int cpu); +static void rcu_cleanup_after_idle(void); +static void rcu_prepare_for_idle(void); +static bool rcu_preempt_has_tasks(struct rcu_node *rnp); +static bool rcu_preempt_need_deferred_qs(struct task_struct *t); +static void 
rcu_preempt_deferred_qs(struct task_struct *t); +static void zero_cpu_stall_ticks(struct rcu_data *rdp); +static bool rcu_nocb_cpu_needs_barrier(int cpu); +static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp); +static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq); +static void rcu_init_one_nocb(struct rcu_node *rnp); +static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp, + bool lazy, unsigned long flags); +static bool rcu_nocb_adopt_orphan_cbs(struct rcu_data *my_rdp, + struct rcu_data *rdp, + unsigned long flags); +static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp); +static void do_nocb_deferred_wakeup(struct rcu_data *rdp); +static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp); +static void rcu_spawn_cpu_nocb_kthread(int cpu); +static void __init rcu_spawn_nocb_kthreads(void); +#ifdef CONFIG_RCU_NOCB_CPU +static void __init rcu_organize_nocb_kthreads(void); +#endif /* #ifdef CONFIG_RCU_NOCB_CPU */ +static bool init_nocb_callback_list(struct rcu_data *rdp); +static unsigned long rcu_get_n_cbs_nocb_cpu(struct rcu_data *rdp); +static void rcu_bind_gp_kthread(void); +static bool rcu_nohz_full_cpu(void); +static void rcu_dynticks_task_enter(void); +static void rcu_dynticks_task_exit(void); + +/* Forward declarations for tree_stall.h */ +static void record_gp_stall_check_time(void); +static void rcu_iw_handler(struct irq_work *iwp); +static void check_cpu_stall(struct rcu_data *rdp); +static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp, + const unsigned long gpssdelay); + #ifdef MODULE_PARAM_PREFIX #undef MODULE_PARAM_PREFIX #endif diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index e253d11..9790b58 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -392,58 +392,5 @@ struct rcu_state { #endif /* #else #ifdef CONFIG_TRACING */ int rcu_dynticks_snap(struct rcu_data *rdp); - -/* Forward declarations for tree_plugin.h */ -static void rcu_bootup_announce(void); -static 
void rcu_qs(void); -static int rcu_preempt_blocked_readers_cgp(struct rcu_node *rnp); -#ifdef CONFIG_HOTPLUG_CPU -static bool rcu_preempt_has_tasks(struct rcu_node *rnp); -#endif /* #ifdef CONFIG_HOTPLUG_CPU */ -static int rcu_print_task_exp_stall(struct rcu_node *rnp); -static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp); -static void rcu_flavor_sched_clock_irq(int user); void call_rcu(struct rcu_head *head, rcu_callback_t func); -static void dump_blkd_tasks(struct rcu_node *rnp, int ncheck); -static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags); -static void rcu_preempt_boost_start_gp(struct rcu_node *rnp); -static void invoke_rcu_callbacks_kthread(void); -static bool rcu_is_callbacks_kthread(void); -static void __init rcu_spawn_boost_kthreads(void); -static void rcu_prepare_kthreads(int cpu); -static void rcu_cleanup_after_idle(void); -static void rcu_prepare_for_idle(void); -static bool rcu_preempt_has_tasks(struct rcu_node *rnp); -static bool rcu_preempt_need_deferred_qs(struct task_struct *t); -static void rcu_preempt_deferred_qs(struct task_struct *t); -static void zero_cpu_stall_ticks(struct rcu_data *rdp); -static bool rcu_nocb_cpu_needs_barrier(int cpu); -static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp); -static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq); -static void rcu_init_one_nocb(struct rcu_node *rnp); -static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp, - bool lazy, unsigned long flags); -static bool rcu_nocb_adopt_orphan_cbs(struct rcu_data *my_rdp, - struct rcu_data *rdp, - unsigned long flags); -static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp); -static void do_nocb_deferred_wakeup(struct rcu_data *rdp); -static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp); -static void rcu_spawn_cpu_nocb_kthread(int cpu); -static void __init rcu_spawn_nocb_kthreads(void); -#ifdef CONFIG_RCU_NOCB_CPU -static void __init 
rcu_organize_nocb_kthreads(void); -#endif /* #ifdef CONFIG_RCU_NOCB_CPU */ -static bool init_nocb_callback_list(struct rcu_data *rdp); -static unsigned long rcu_get_n_cbs_nocb_cpu(struct rcu_data *rdp); -static void rcu_bind_gp_kthread(void); -static bool rcu_nohz_full_cpu(void); -static void rcu_dynticks_task_enter(void); -static void rcu_dynticks_task_exit(void); - -/* Forward declarations for tree_stall.h */ -static void record_gp_stall_check_time(void); -static void rcu_iw_handler(struct irq_work *iwp); -static void check_cpu_stall(struct rcu_data *rdp); -static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp, - const unsigned long gpssdelay); + From patchwork Thu Jul 11 14:25:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040127 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 54248112C for ; Thu, 11 Jul 2019 14:28:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4421428A5A for ; Thu, 11 Jul 2019 14:28:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 375C428AC8; Thu, 11 Jul 2019 14:28:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D58D828A5A for ; Thu, 11 Jul 2019 14:28:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728869AbfGKO2D (ORCPT ); Thu, 11 Jul 2019 10:28:03 -0400 Received: 
from userp2130.oracle.com ([156.151.31.86]:37516 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728529AbfGKO2C (ORCPT ); Thu, 11 Jul 2019 10:28:02 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO8Kr001447; Thu, 11 Jul 2019 14:26:44 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=j5hNYpa6Ps4ofAY49xQlyVzas1I7FTDoeSXSCbctYus=; b=IZOylVirFWDOmZYSdGDrwFnJwm9CEHLxntE+KdhdKKVzqt1gSAfqZTTsVwqNsGzmUzgE PAj74VbhDOsq4BCutkFXYEtKMfHv6uDQN4EwPs/i8mhcZpVyCpdqZIKiMK3B0iniWbhc nCHcdwRFI9A6XWvL9fDPRjGkqCOObfWGxfS7XmG6Q9mdYcz4LapWwYkhkHzoQvWylY6M ysC+FaRK07kBZ2670+CaYM92TpCurpQy/wzZmTZFtJvwBPFzsFwFvWvwJU5DIkIsFNK3 zM5EMYBeH4z52ZPL8IQ1EjyFUcZnoz5RUscIoFGUEJ9sBLvHsPC7wF+/3bg5aRqA88yQ 5g== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0e1m-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:44 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuB021444; Thu, 11 Jul 2019 14:26:41 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 18/26] rcu: Make percpu rcu_data non-static Date: Thu, 11 Jul 2019 16:25:30 +0200 Message-Id: <1562855138-19507-19-git-send-email-alexandre.chartre@oracle.com> X-Mailer: 
git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Make percpu rcu_data non-static so that it can be mapped into an isolation address space page-table. This will allow address space isolation to use RCU without faulting. Signed-off-by: Alexandre Chartre --- kernel/rcu/tree.c | 2 +- kernel/rcu/tree.h | 1 + 2 files changed, 2 insertions(+), 1 deletions(-) diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 44dd3b4..2827b2b 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -126,7 +126,7 @@ static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp, #define rcu_eqs_special_exit() do { } while (0) #endif -static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = { +DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = { .dynticks_nesting = 1, .dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE, .dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR), diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 9790b58..a043fde 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -394,3 +394,4 @@ struct rcu_state { int rcu_dynticks_snap(struct rcu_data *rdp); void call_rcu(struct rcu_head *head, rcu_callback_t func); +DECLARE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data); From patchwork Thu Jul 11 14:25:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 
1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040161 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A9CCF112C for ; Thu, 11 Jul 2019 14:29:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 993B528ABE for ; Thu, 11 Jul 2019 14:29:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8CFF328AD9; Thu, 11 Jul 2019 14:29:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 37B6428ABE for ; Thu, 11 Jul 2019 14:29:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728943AbfGKO2Q (ORCPT ); Thu, 11 Jul 2019 10:28:16 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:37756 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728926AbfGKO2P (ORCPT ); Thu, 11 Jul 2019 10:28:15 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO8vO001464; Thu, 11 Jul 2019 14:26:48 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=2d2Yv2y5zLW9fLZ4ZOO2tadSx6yS9UU2pD1uUkqP2I4=; b=PEK1VY+3UVMeYt+RB002bYINLbdCxWtdVaXXuLcUErNmdDVQeXpj0b9OMDpmCIqqZa4Y 9PvB0Lnb9b0uhxGmrMFXcR44+qa8LQlt1HNPX0Dv/WgAESNOrwyxaWcdUWdyw5aYvMo9 6hMEg+A/C1vRmGh2CVcZaeUmibNGRM+Z0ynXpbec7GOZxoWF9wbcV+Bk5OzwKInmLI0/ 
2E7YCwLLpvnIvEy49dwDdr51/iqNr/GOk+NSxBuovD4OQOPclLfLK8Qy+EAEEUquuZ8L HADMT7wC0JoiunyhO1kAnDjpVZAnFUxpCSF1OfqJMoGYFOwX+PMYYSzDX1GbCZ5/xKmc PQ== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0e24-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:48 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuC021444; Thu, 11 Jul 2019 14:26:44 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 19/26] mm/asi: Add option to map RCU data Date: Thu, 11 Jul 2019 16:25:31 +0200 Message-Id: <1562855138-19507-20-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add an option to map RCU data when creating an 
ASI. This will map the percpu rcu_data (which is not exported by the kernel), and allow ASI to use RCU without faulting. Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 1 + arch/x86/mm/asi.c | 4 ++++ 2 files changed, 5 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index a277e43..8199618 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -18,6 +18,7 @@ #define ASI_MAP_STACK_CANARY 0x01 /* map stack canary */ #define ASI_MAP_CPU_PTR 0x02 /* for get_cpu_var()/this_cpu_ptr() */ #define ASI_MAP_CURRENT_TASK 0x04 /* map the current task */ +#define ASI_MAP_RCU_DATA 0x08 /* map rcu data */ enum page_table_level { PGT_LEVEL_PTE, diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index acd1135..20c23dc 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -7,6 +7,7 @@ #include #include +#include #include #include #include @@ -16,6 +17,8 @@ #include #include +#include "../../../kernel/rcu/tree.h" + /* ASI sessions, one per cpu */ DEFINE_PER_CPU_PAGE_ALIGNED(struct asi_session, cpu_asi_session); @@ -29,6 +32,7 @@ struct asi_map_option asi_map_percpu_options[] = { { ASI_MAP_STACK_CANARY, &fixed_percpu_data, sizeof(fixed_percpu_data) }, { ASI_MAP_CPU_PTR, &this_cpu_off, sizeof(this_cpu_off) }, { ASI_MAP_CURRENT_TASK, &current_task, sizeof(current_task) }, + { ASI_MAP_RCU_DATA, &rcu_data, sizeof(rcu_data) }, }; static void asi_log_fault(struct asi *asi, struct pt_regs *regs, From patchwork Thu Jul 11 14:25:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040167 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 44668112C for ; Thu, 11 Jul 2019 14:29:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by
mail.wl.linuxfoundation.org (Postfix) with ESMTP id 34F6528AC8 for ; Thu, 11 Jul 2019 14:29:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 28F2928AD9; Thu, 11 Jul 2019 14:29:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D276C28AC8 for ; Thu, 11 Jul 2019 14:29:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728911AbfGKO2N (ORCPT ); Thu, 11 Jul 2019 10:28:13 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:42496 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728826AbfGKO2L (ORCPT ); Thu, 11 Jul 2019 10:28:11 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOMdI013631; Thu, 11 Jul 2019 14:26:51 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=9FGKxWOaR9lg8qKZvVxZYyBDHDZO4/U+GnNunz7EXSk=; b=lniqBkC9yuqD/Ju9/6Xyu42PxGGU6Yt/2fjCL1pA8uTadjROIw7CBcc8faiSFRo7Bv3q BPubFEelFsfRjyPHH37WhqoIXQX5UbWj/nD4TDzwZBHQoeubLcnl1AB5vQ/mCucdF+Yu 6nQIlGjLClEFr6VXl3jVjtuCT9T1LE8cucXjWWInMM2yXZ80VjgZXh9Wagaq6vop/Niu wJjXwR5Jbi7oMNqUbY65Is6pPs++Vxlbu62PVImCdj3jQ/wjgIZg0r9DuCe22wMVOgbF ph3T8g1h5qYYjiZctfTIEHMk2DM78dBxXxMV6ZGQQw9l9nE44jpFZXce/yhCQnDc6q9c 6g== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2120.oracle.com with ESMTP id 2tjm9r0bsr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:51 +0000 Received: from 
achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuD021444; Thu, 11 Jul 2019 14:26:47 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 20/26] mm/asi: Add option to map cpu_hw_events Date: Thu, 11 Jul 2019 16:25:32 +0200 Message-Id: <1562855138-19507-21-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add an option to map cpu_hw_events in the ASI page-table. Also restructure to select options for percpu optional mapping.
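For illustration only, the flag-driven selection of optional percpu mappings can be modelled in userspace as below. This is a minimal sketch and not the code from this series: only the ASI_MAP_* values come from the asi.h hunks, while the ptr/size fields, the demo_* objects and the demo_map_selected() helper are hypothetical stand-ins for the kernel percpu data and for whatever actually populates the ASI page-table.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values as defined in the asi.h hunks of this series. */
#define ASI_MAP_STACK_CANARY  0x01
#define ASI_MAP_CPU_PTR       0x02
#define ASI_MAP_CURRENT_TASK  0x04
#define ASI_MAP_RCU_DATA      0x08
#define ASI_MAP_CPU_HW_EVENTS 0x10

/* Assumed layout: one flag plus the object's address and size. */
struct demo_map_option {
	int flag;
	void *ptr;
	size_t size;
};

/* Hypothetical stand-ins for the kernel percpu objects. */
char demo_canary[8];
char demo_cpu_off[8];
char demo_current_task[8];
char demo_rcu_data[64];
char demo_cpu_hw_events[128];

struct demo_map_option demo_options[] = {
	{ ASI_MAP_STACK_CANARY, demo_canary, sizeof(demo_canary) },
	{ ASI_MAP_CPU_PTR, demo_cpu_off, sizeof(demo_cpu_off) },
	{ ASI_MAP_CURRENT_TASK, demo_current_task, sizeof(demo_current_task) },
	{ ASI_MAP_RCU_DATA, demo_rcu_data, sizeof(demo_rcu_data) },
	{ ASI_MAP_CPU_HW_EVENTS, demo_cpu_hw_events, sizeof(demo_cpu_hw_events) },
};

/* Hypothetical callback standing in for "map this range into the ASI". */
typedef int (*demo_map_fn)(void *ptr, size_t size, void *ctx);

/*
 * Walk the option table and call map() for every entry whose flag is
 * set in map_flags. Returns 0, or the first non-zero value from map().
 */
int demo_map_selected(const struct demo_map_option *opts, size_t count,
		      int map_flags, demo_map_fn map, void *ctx)
{
	size_t i;
	int err;

	for (i = 0; i < count; i++) {
		if (!(map_flags & opts[i].flag))
			continue;
		err = map(opts[i].ptr, opts[i].size, ctx);
		if (err)
			return err;
	}
	return 0;
}

/* Example callback: only accumulates the number of bytes "mapped". */
int demo_count_bytes(void *ptr, size_t size, void *ctx)
{
	(void)ptr;
	*(size_t *)ctx += size;
	return 0;
}
```

With this shape, supporting a new optional mapping only takes a new flag and one more table entry, which matches what the diff for this patch does.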
Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 1 + arch/x86/mm/asi.c | 3 +++ 2 files changed, 4 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index 8199618..f489551 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -19,6 +19,7 @@ #define ASI_MAP_CPU_PTR 0x02 /* for get_cpu_var()/this_cpu_ptr() */ #define ASI_MAP_CURRENT_TASK 0x04 /* map the current task */ #define ASI_MAP_RCU_DATA 0x08 /* map rcu data */ +#define ASI_MAP_CPU_HW_EVENTS 0x10 /* map cpu hw events */ enum page_table_level { PGT_LEVEL_PTE, diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index 20c23dc..d488704 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -17,6 +18,7 @@ #include #include +#include "../events/perf_event.h" #include "../../../kernel/rcu/tree.h" /* ASI sessions, one per cpu */ @@ -33,6 +35,7 @@ struct asi_map_option asi_map_percpu_options[] = { { ASI_MAP_CPU_PTR, &this_cpu_off, sizeof(this_cpu_off) }, { ASI_MAP_CURRENT_TASK, &current_task, sizeof(current_task) }, { ASI_MAP_RCU_DATA, &rcu_data, sizeof(rcu_data) }, + { ASI_MAP_CPU_HW_EVENTS, &cpu_hw_events, sizeof(cpu_hw_events) }, }; static void asi_log_fault(struct asi *asi, struct pt_regs *regs, From patchwork Thu Jul 11 14:25:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040153 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 954C61395 for ; Thu, 11 Jul 2019 14:29:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 84CC228ABE for ; Thu, 11 Jul 2019 14:29:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id
7736F28AD1; Thu, 11 Jul 2019 14:29:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0626328ABE for ; Thu, 11 Jul 2019 14:29:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729123AbfGKO3L (ORCPT ); Thu, 11 Jul 2019 10:29:11 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:42866 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728986AbfGKO20 (ORCPT ); Thu, 11 Jul 2019 10:28:26 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO7iI013253; Thu, 11 Jul 2019 14:26:54 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=/gc70sOJB+vK9w9Yd0bv1jRw7Ga4U07eYdWLQD2PoZs=; b=EUNz/kVVuwEftp1KU5qURSDwp+TGdX5vvQ5vTvYLS3DXTF1MuLYbqn75wFx41g9hyOGh RM9we7oEvKnvKWG/vpmQwbpaiu20SQ1RssVY5TTvMhrnQ8jKRj/CsRaQfpMKJY/qrm/1 LbhGJOplC+6BG/nJHq9sTPd5TIz1YACiQ9xWzDt1BtXTgliVimphNKiT1UU38OYP7oSW ELLA9E7UVvNKoLW2OQtdqbwZk0J9ufJIiTIIREI+9J5HCJcDTctrZalqmR+lSizxjXTr S4aV0Cb5jllRnc+3jSzNcfsprSQmpSqw2GObGMXoLB3Fng/zJt5MeVYf3wZ0eH8MRSjK eQ== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2120.oracle.com with ESMTP id 2tjm9r0btb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:26:54 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuE021444; Thu, 11 Jul 2019 14:26:50 GMT 
From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 21/26] mm/asi: Make functions to read cr3/cr4 ASI aware Date: Thu, 11 Jul 2019 16:25:33 +0200 Message-Id: <1562855138-19507-22-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When address space isolation is active, cpu_tlbstate isn't necessarily mapped in the ASI page-table, so accessing it would cause ASI to fault. Instead of just mapping cpu_tlbstate, update __get_current_cr3_fast() and cr4_read_shadow() to use cr3/cr4 values cached in the ASI session when ASI is active. Note that the cached cr3 value is the ASI cr3 value (i.e. the current CR3 value when ASI is active). The cached cr4 value is the cr4 value when isolation was entered (ASI doesn't change cr4).
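For illustration only, the caching scheme described above can be modelled in userspace as follows. This is a minimal sketch under stated assumptions, not the kernel code: the demo_* types and functions are hypothetical, the "mapped" flag stands in for cpu_tlbstate being present in the current page-table, and the cr3 field stands in for the value build_cr3() would compute from cpu_tlbstate.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_session_state { DEMO_SESSION_INACTIVE, DEMO_SESSION_ACTIVE };

/* Models the cr3/cr4 fields of the per-cpu ASI session. */
struct demo_asi_session {
	enum demo_session_state state;
	unsigned long isolation_cr3;	/* cr3 while ASI is active */
	unsigned long original_cr3;	/* cr3 before entering ASI */
	unsigned long original_cr4;	/* cr4 before entering ASI */
};

/* Models cpu_tlbstate, which may be unmapped while ASI is active. */
struct demo_tlbstate {
	bool mapped;
	unsigned long cr3;
	unsigned long cr4;
};

/* Analogue of __get_current_cr3_fast(): prefer the session cache. */
unsigned long demo_get_cr3(const struct demo_asi_session *s,
			   const struct demo_tlbstate *t)
{
	if (s->state == DEMO_SESSION_ACTIVE)
		return s->isolation_cr3;
	assert(t->mapped);	/* touching it unmapped would fault */
	return t->cr3;
}

/* Analogue of cr4_read_shadow(): ASI does not change cr4. */
unsigned long demo_read_cr4(const struct demo_asi_session *s,
			    const struct demo_tlbstate *t)
{
	if (s->state == DEMO_SESSION_ACTIVE)
		return s->original_cr4;
	assert(t->mapped);
	return t->cr4;
}

/* Snapshot cr3/cr4 while tlbstate is still reachable, then go active. */
void demo_asi_enter(struct demo_asi_session *s,
		    const struct demo_tlbstate *t, unsigned long asi_cr3)
{
	s->original_cr3 = demo_get_cr3(s, t);
	s->original_cr4 = demo_read_cr4(s, t);
	s->isolation_cr3 = asi_cr3;
	s->state = DEMO_SESSION_ACTIVE;
}
```

The point of the ordering in demo_asi_enter() is that both values are read before the session is marked active, so the readers never need cpu_tlbstate once isolation has started.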
Signed-off-by: Alexandre Chartre --- arch/x86/include/asm/asi.h | 2 ++ arch/x86/include/asm/mmu_context.h | 20 ++++++++++++++++++-- arch/x86/include/asm/tlbflush.h | 10 ++++++++++ arch/x86/mm/asi.c | 3 +++ 4 files changed, 33 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h index f489551..07c2b50 100644 --- a/arch/x86/include/asm/asi.h +++ b/arch/x86/include/asm/asi.h @@ -73,7 +73,9 @@ struct asi_session { enum asi_session_state state; /* state of ASI session */ bool retry_abort; /* always retry abort */ unsigned int abort_depth; /* abort depth */ + unsigned long isolation_cr3; /* cr3 when ASI is active */ unsigned long original_cr3; /* cr3 before entering ASI */ + unsigned long original_cr4; /* cr4 before entering ASI */ struct task_struct *task; /* task during isolation */ } __aligned(PAGE_SIZE); diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 9024236..8cec983 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -14,6 +14,7 @@ #include #include #include +#include extern atomic64_t last_mm_ctx_id; @@ -347,8 +348,23 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, */ static inline unsigned long __get_current_cr3_fast(void) { - unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, - this_cpu_read(cpu_tlbstate.loaded_mm_asid)); + unsigned long cr3; + +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + /* + * If isolation is active, cpu_tlbstate isn't necessarily mapped + * in the ASI page-table (and it doesn't have the current pgd anyway). + * The current CR3 is cached in the CPU ASI session. 
+ */ + if (this_cpu_read(cpu_asi_session.state) == ASI_SESSION_STATE_ACTIVE) + cr3 = this_cpu_read(cpu_asi_session.isolation_cr3); + else + cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, + this_cpu_read(cpu_tlbstate.loaded_mm_asid)); +#else + cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, + this_cpu_read(cpu_tlbstate.loaded_mm_asid)); +#endif /* For now, be very restrictive about when this can be called. */ VM_WARN_ON(in_nmi() || preemptible()); diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index dee3758..917f9a5 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -12,6 +12,7 @@ #include #include #include +#include /* * The x86 feature is called PCID (Process Context IDentifier). It is similar @@ -324,6 +325,15 @@ static inline void cr4_toggle_bits_irqsoff(unsigned long mask) /* Read the CR4 shadow. */ static inline unsigned long cr4_read_shadow(void) { +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + /* + * If isolation is active, cpu_tlbstate isn't necessarily mapped + * in the ASI page-table. The CR4 value is cached in the CPU + * ASI session. + */ + if (this_cpu_read(cpu_asi_session.state) == ASI_SESSION_STATE_ACTIVE) + return this_cpu_read(cpu_asi_session.original_cr4); +#endif return this_cpu_read(cpu_tlbstate.cr4); } diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c index d488704..4a5a4ba 100644 --- a/arch/x86/mm/asi.c +++ b/arch/x86/mm/asi.c @@ -23,6 +23,7 @@ /* ASI sessions, one per cpu */ DEFINE_PER_CPU_PAGE_ALIGNED(struct asi_session, cpu_asi_session); +EXPORT_SYMBOL(cpu_asi_session); struct asi_map_option { int flag; @@ -291,6 +292,8 @@ int asi_enter(struct asi *asi) goto err_unmap_task; } asi_session->original_cr3 = original_cr3; + asi_session->original_cr4 = cr4_read_shadow(); + asi_session->isolation_cr3 = __sme_pa(asi->pgd); /* * Use ASI barrier as we are setting CR3 with the ASI page-table. 
From patchwork Thu Jul 11 14:25:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040159 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C50A51395 for ; Thu, 11 Jul 2019 14:29:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B621128ABE for ; Thu, 11 Jul 2019 14:29:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AA3DD28AD1; Thu, 11 Jul 2019 14:29:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5108028ABE for ; Thu, 11 Jul 2019 14:29:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728952AbfGKO2R (ORCPT ); Thu, 11 Jul 2019 10:28:17 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:39982 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728928AbfGKO2Q (ORCPT ); Thu, 11 Jul 2019 10:28:16 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOGvb100511; Thu, 11 Jul 2019 14:27:02 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=R7Xfip06p699muIVGHbRVSpcUyRHanu29pNDIpaJD54=; b=LYEr+9/uvSyvI7OVyk9UP5FJcmH4YzsINobd9Ws4d1Sn7zQRqsU665EFWo2tRiz/cBWH 
TQPi+azMpPQ50ca6ZxpYu/g0a8JuwQaUyBMwUTLzLIprNNem/R910eIUU8P5Bv98N3Zz VQLKGIpKXFE80COh/4g3j2DCRhhYcgvjn2KBcjXSAHnxcqRegfAAZE4hnlxbfYZgQOEd YTBZwZfCqCbfS+eDaW7yYwGctCL21NUDe7ydmDXBTvwj7gQ1BR2oLXZZ5z2nmPV3Av6O uW3d4XWaCBCFsg3LcCI9MxRw/Tgth3Gipeew1Xr8UIl8LL6WLVneXw3ZkgDZG2ZEH0Xh 4A== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp2120.oracle.com with ESMTP id 2tjkkq0cdp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:27:02 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuF021444; Thu, 11 Jul 2019 14:26:53 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 22/26] KVM: x86/asi: Introduce address_space_isolation module parameter Date: Thu, 11 Jul 2019 16:25:34 +0200 Message-Id: <1562855138-19507-23-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: 
kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Liran Alon Add the address_space_isolation parameter to the kvm module. When set to true, KVM #VMExit handlers run in an isolated address space which maps only KVM-required code and per-VM information instead of the entire kernel address space. This mechanism is meant to mitigate memory-leak side-channel CPU vulnerabilities (e.g. Spectre, L1TF, etc.) but can also be viewed as defense in depth, as it also helps generically against info-leak vulnerabilities in KVM #VMExit handlers and reduces the available gadgets for ROP attacks. This is set to false by default because it incurs a performance hit which some users will not want to take for the security gain. Signed-off-by: Liran Alon Signed-off-by: Alexandre Chartre --- arch/x86/kvm/Makefile | 3 ++- arch/x86/kvm/vmx/isolation.c | 26 ++++++++++++++++++++++++++ 2 files changed, 28 insertions(+), 1 deletions(-) create mode 100644 arch/x86/kvm/vmx/isolation.c diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index 31ecf7a..71579ed 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -12,7 +12,8 @@ kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \ i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \ hyperv.o page_track.o debugfs.o -kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o vmx/evmcs.o vmx/nested.o +kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \ + vmx/evmcs.o vmx/nested.o vmx/isolation.o kvm-amd-y += svm.o pmu_amd.o obj-$(CONFIG_KVM) += kvm.o diff --git a/arch/x86/kvm/vmx/isolation.c b/arch/x86/kvm/vmx/isolation.c new file mode 100644 index 0000000..e25f663 --- /dev/null +++ b/arch/x86/kvm/vmx/isolation.c @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2019, Oracle and/or its affiliates. All rights reserved.
+ * + * KVM Address Space Isolation + */ + +#include +#include + +/* + * When set to true, KVM #VMExit handlers run in isolated address space + * which maps only KVM required code and per-VM information instead of + * entire kernel address space. + * + * This mechanism is meant to mitigate memory-leak side-channels CPU + * vulnerabilities (e.g. Spectre, L1TF and etc.) but can also be viewed + * as security in-depth as it also helps generically against info-leaks + * vulnerabilities in KVM #VMExit handlers and reduce the available + * gadgets for ROP attacks. + * + * This is set to false by default because it incurs a performance hit + * which some users will not want to take for security gain. + */ +static bool __read_mostly address_space_isolation; +module_param(address_space_isolation, bool, 0444); From patchwork Thu Jul 11 14:25:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040129 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0BE26112C for ; Thu, 11 Jul 2019 14:28:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F0B9C28A5A for ; Thu, 11 Jul 2019 14:28:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E45E028AC8; Thu, 11 Jul 2019 14:28:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7655E28A5A for ; Thu, 11 Jul 2019 14:28:12 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728902AbfGKO2L (ORCPT ); Thu, 11 Jul 2019 10:28:11 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:37630 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728889AbfGKO2J (ORCPT ); Thu, 11 Jul 2019 10:28:09 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO8MS001456; Thu, 11 Jul 2019 14:27:06 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=0gU3+EHAnLmrdQwDq6KHZSyy6MdJI7eVGdpzzFAPgXk=; b=iiyyes3p413lD4tawJspczGrP6E0lodllk7rkaTrmsT9/Hc2gRBbnS6pJ51rtyC6w6oK s0zXzPmMoMMZfu+NfOM5wgMpW+ZwZb5dNJIjL1sOQ2wAe8I6VnoBZVNNsVBlEfHguvjZ UWzv4mgQ5Z/QbJM2eHak92dK2mAAiscm2gFOv4AEAAm/6hM07TfZIH/hBOE3puhP7Woq 7qMIQ3AxCkBa3FdCnuUeCDTF3Vl7BFaFDKI8v0ZHZA9v/ppmgWW2qg65hjSCEFqy4KMB vWZ/Kr+eO2oRgHWyu5lBLP3blCrBHPO1SGFrwvoYGRq8pBU17+caS24D540bGjrkqGJ+ fA== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0e4s-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:27:05 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuG021444; Thu, 11 Jul 2019 14:26:57 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 23/26] KVM: x86/asi: Introduce KVM 
address space isolation Date: Thu, 11 Jul 2019 16:25:35 +0200 Message-Id: <1562855138-19507-24-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Liran Alon Create a separate address space for KVM that will be active when KVM #VMExit handlers run, up until the point at which we architecturally need to access host (or other VM) sensitive data. This patch just creates the address space using address space isolation (ASI) but does not make it active yet. This will be done by the next commits.
Signed-off-by: Liran Alon Signed-off-by: Alexandre Chartre --- arch/x86/kvm/vmx/isolation.c | 58 ++++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmx.c | 7 ++++- arch/x86/kvm/vmx/vmx.h | 3 ++ include/linux/kvm_host.h | 5 +++ 4 files changed, 72 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/vmx/isolation.c b/arch/x86/kvm/vmx/isolation.c index e25f663..644d8d3 100644 --- a/arch/x86/kvm/vmx/isolation.c +++ b/arch/x86/kvm/vmx/isolation.c @@ -7,6 +7,15 @@ #include #include +#include +#include +#include + +#include "vmx.h" +#include "x86.h" + +#define VMX_ASI_MAP_FLAGS \ + (ASI_MAP_STACK_CANARY | ASI_MAP_CPU_PTR | ASI_MAP_CURRENT_TASK) /* * When set to true, KVM #VMExit handlers run in isolated address space @@ -24,3 +33,52 @@ */ static bool __read_mostly address_space_isolation; module_param(address_space_isolation, bool, 0444); + +static int vmx_isolation_init_mapping(struct asi *asi, struct vcpu_vmx *vmx) +{ + /* TODO: Populate the KVM ASI page-table */ + + return 0; +} + +int vmx_isolation_init(struct vcpu_vmx *vmx) +{ + struct kvm_vcpu *vcpu = &vmx->vcpu; + struct asi *asi; + int err; + + if (!address_space_isolation) { + vcpu->asi = NULL; + return 0; + } + + asi = asi_create(VMX_ASI_MAP_FLAGS); + if (!asi) { + pr_debug("KVM: x86: Failed to create address space isolation\n"); + return -ENXIO; + } + + err = vmx_isolation_init_mapping(asi, vmx); + if (err) { + vcpu->asi = NULL; + return err; + } + + vcpu->asi = asi; + + pr_info("KVM: x86: Running with isolated address space\n"); + + return 0; +} + +void vmx_isolation_uninit(struct vcpu_vmx *vmx) +{ + struct kvm_vcpu *vcpu = &vmx->vcpu; + + if (!address_space_isolation || !vcpu->asi) + return; + + asi_destroy(vcpu->asi); + vcpu->asi = NULL; + pr_info("KVM: x86: End of isolated address space\n"); +} diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index d98eac3..9b92467 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -202,7 +202,7 @@ }; #define L1D_CACHE_ORDER 4 
-static void *vmx_l1d_flush_pages; +void *vmx_l1d_flush_pages; static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf) { @@ -6561,6 +6561,7 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); + vmx_isolation_uninit(vmx); if (enable_pml) vmx_destroy_pml_buffer(vmx); free_vpid(vmx->vpid); @@ -6672,6 +6673,10 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu) vmx->ept_pointer = INVALID_PAGE; + err = vmx_isolation_init(vmx); + if (err) + goto free_vmcs; + return &vmx->vcpu; free_vmcs: diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 61128b4..09c1593 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -525,4 +525,7 @@ static inline void decache_tsc_multiplier(struct vcpu_vmx *vmx) void dump_vmcs(void); +int vmx_isolation_init(struct vcpu_vmx *vmx); +void vmx_isolation_uninit(struct vcpu_vmx *vmx); + #endif /* __KVM_X86_VMX_H */ diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index d1ad38a..2a9d073 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -34,6 +34,7 @@ #include #include +#include #ifndef KVM_MAX_VCPU_ID #define KVM_MAX_VCPU_ID KVM_MAX_VCPUS @@ -320,6 +321,10 @@ struct kvm_vcpu { bool preempted; struct kvm_vcpu_arch arch; struct dentry *debugfs_dentry; + +#ifdef CONFIG_ADDRESS_SPACE_ISOLATION + struct asi *asi; +#endif }; static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) From patchwork Thu Jul 11 14:25:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040131 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7A3AE1395 for ; Thu, 11 Jul 2019 14:28:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 660DD28A5A for ; 
Thu, 11 Jul 2019 14:28:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5633E28AC8; Thu, 11 Jul 2019 14:28:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D377A28A5A for ; Thu, 11 Jul 2019 14:28:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728992AbfGKO20 (ORCPT ); Thu, 11 Jul 2019 10:28:26 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:37904 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728976AbfGKO2Y (ORCPT ); Thu, 11 Jul 2019 10:28:24 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO8ft001446; Thu, 11 Jul 2019 14:27:04 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=44bGpNhDn4CQMJWfcA7dlBtv4lKR0GVNyXSLf3OmODU=; b=sYC68pAgGYMeZCzOFkUOuj4EApQvKlshFt7mE2WAdPKV4oXjKlwvx1k4YJanPkydZ+Rn nhL5rnoahTfhc8IS1gDxcvz2TkFDuiQbUpBXbrUM3d7hA1hqNT1LXCfxGliFZOD6TdOF /3LBjw+5bGaWzvBtKCMuwVIUzbBatm/N7BwPJzQJCnQfj9Csi3AeniX/ps6uYP+y0BkL obw32Tp/dAVmexQz4kk3ptW1s2ArR2RXMrCD7YlTDm2ZoFxL1jsctlJFVVEs2mdJMt1K kWQize+PYuE3+4ce/ZZAOApqTAW4neDTZZCWGAKbNKb/yPwlRIT1dU+LMDkfey1//U/x dQ== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0e4a-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:27:03 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com 
[10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuH021444; Thu, 11 Jul 2019 14:27:00 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 24/26] KVM: x86/asi: Populate the KVM ASI page-table Date: Thu, 11 Jul 2019 16:25:36 +0200 Message-Id: <1562855138-19507-25-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add mappings to the KVM ASI page-table so that KVM can run with its address space isolation without faulting too much. 
Signed-off-by: Alexandre Chartre --- arch/x86/kvm/vmx/isolation.c | 155 ++++++++++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/vmx.c | 1 - arch/x86/kvm/vmx/vmx.h | 3 + 3 files changed, 154 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/vmx/isolation.c b/arch/x86/kvm/vmx/isolation.c index 644d8d3..d82f6b6 100644 --- a/arch/x86/kvm/vmx/isolation.c +++ b/arch/x86/kvm/vmx/isolation.c @@ -5,7 +5,7 @@ * KVM Address Space Isolation */ -#include +#include #include #include #include @@ -14,8 +14,11 @@ #include "vmx.h" #include "x86.h" -#define VMX_ASI_MAP_FLAGS \ - (ASI_MAP_STACK_CANARY | ASI_MAP_CPU_PTR | ASI_MAP_CURRENT_TASK) +#define VMX_ASI_MAP_FLAGS (ASI_MAP_STACK_CANARY | \ + ASI_MAP_CPU_PTR | \ + ASI_MAP_CURRENT_TASK | \ + ASI_MAP_RCU_DATA | \ + ASI_MAP_CPU_HW_EVENTS) /* * When set to true, KVM #VMExit handlers run in isolated address space @@ -34,9 +37,153 @@ static bool __read_mostly address_space_isolation; module_param(address_space_isolation, bool, 0444); +/* + * Map various kernel data. + */ +static int vmx_isolation_map_kernel_data(struct asi *asi) +{ + int err; + + /* map context_tracking, used by guest_enter_irqoff() */ + err = ASI_MAP_CPUVAR(asi, context_tracking); + if (err) + return err; + + /* map irq_stat, used by kvm_*_cpu_l1tf_flush_l1d */ + err = ASI_MAP_CPUVAR(asi, irq_stat); + if (err) + return err; + return 0; +} + +/* + * Map kvm module and data from that module. + */ +static int vmx_isolation_map_kvm_data(struct asi *asi, struct kvm *kvm) +{ + int err; + + /* map kvm module */ + err = asi_map_module(asi, "kvm"); + if (err) + return err; + + err = asi_map_percpu(asi, kvm->srcu.sda, + sizeof(struct srcu_data)); + if (err) + return err; + + return 0; +} + +/* + * Map kvm-intel module and generic x86 data. 
+ */ +static int vmx_isolation_map_kvm_x86_data(struct asi *asi) +{ + int err; + + /* map current module (kvm-intel) */ + err = ASI_MAP_THIS_MODULE(asi); + if (err) + return err; + + /* map current_vcpu, used by vcpu_enter_guest() */ + err = ASI_MAP_CPUVAR(asi, current_vcpu); + if (err) + return err; + + return 0; +} + +/* + * Map vmx data. + */ +static int vmx_isolation_map_kvm_vmx_data(struct asi *asi, struct vcpu_vmx *vmx) +{ + struct kvm_vmx *kvm_vmx; + struct kvm_vcpu *vcpu; + struct kvm *kvm; + int err; + + vcpu = &vmx->vcpu; + kvm = vcpu->kvm; + kvm_vmx = to_kvm_vmx(kvm); + + /* map kvm_vmx (this also maps kvm) */ + err = asi_map(asi, kvm_vmx, sizeof(*kvm_vmx)); + if (err) + return err; + + /* map vmx (this also maps vcpu) */ + err = asi_map(asi, vmx, sizeof(*vmx)); + if (err) + return err; + + /* map vcpu data */ + err = asi_map(asi, vcpu->run, PAGE_SIZE); + if (err) + return err; + + err = asi_map(asi, vcpu->arch.apic, sizeof(struct kvm_lapic)); + if (err) + return err; + + /* + * Map additional vmx data.
+ */ + + if (vmx_l1d_flush_pages) { + err = asi_map(asi, vmx_l1d_flush_pages, + PAGE_SIZE << L1D_CACHE_ORDER); + if (err) + return err; + } + + if (enable_pml) { + err = asi_map(asi, vmx->pml_pg, sizeof(struct page)); + if (err) + return err; + } + + err = asi_map(asi, vmx->guest_msrs, PAGE_SIZE); + if (err) + return err; + + err = asi_map(asi, vmx->vmcs01.vmcs, PAGE_SIZE << vmcs_config.order); + if (err) + return err; + + err = asi_map(asi, vmx->vmcs01.msr_bitmap, PAGE_SIZE); + if (err) + return err; + + err = asi_map(asi, vmx->vcpu.arch.pio_data, PAGE_SIZE); + if (err) + return err; + + return 0; +} + static int vmx_isolation_init_mapping(struct asi *asi, struct vcpu_vmx *vmx) { - /* TODO: Populate the KVM ASI page-table */ + int err; + + err = vmx_isolation_map_kernel_data(asi); + if (err) + return err; + + err = vmx_isolation_map_kvm_data(asi, vmx->vcpu.kvm); + if (err) + return err; + + err = vmx_isolation_map_kvm_x86_data(asi); + if (err) + return err; + + err = vmx_isolation_map_kvm_vmx_data(asi, vmx); + if (err) + return err; return 0; } diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 9b92467..d47f093 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -201,7 +201,6 @@ [VMENTER_L1D_FLUSH_NOT_REQUIRED] = {"not required", false}, }; -#define L1D_CACHE_ORDER 4 void *vmx_l1d_flush_pages; static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf) diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 09c1593..e8de23b 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -11,6 +11,9 @@ #include "ops.h" #include "vmcs.h" +#define L1D_CACHE_ORDER 4 +extern void *vmx_l1d_flush_pages; + extern const u32 vmx_msr_index[]; extern u64 host_efer; From patchwork Thu Jul 11 14:25:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040133 Return-Path: Received: from mail.wl.linuxfoundation.org 
(pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 72FB31395 for ; Thu, 11 Jul 2019 14:28:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 641B628A5A for ; Thu, 11 Jul 2019 14:28:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5860E28AC8; Thu, 11 Jul 2019 14:28:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CFE8728A5A for ; Thu, 11 Jul 2019 14:28:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728926AbfGKO22 (ORCPT ); Thu, 11 Jul 2019 10:28:28 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:40236 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728987AbfGKO20 (ORCPT ); Thu, 11 Jul 2019 10:28:26 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEOeBP100905; Thu, 11 Jul 2019 14:27:06 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=fCmddW3R3yJKPuzx/XlLHeQxEVYNN8yCIHDije5QoA8=; b=Goc71bNby+SDwxhZAZJVL35O0KCgSN6SQ19s01XeIVYA0KpDzxqCOIh+DsMUdWnvypT6 gGIgJRxy1708M9KVMhnYoB2ywb6qTtvwn4CXAcD7cmjlrvcJBS5+6PMDrIoeoEOtcDJK KW4X5dXci15RHrRtDaZcLfMFnXMXtBJVZEJjoExPc4mdpEdW63wB2JO1A+Dh4bIcTLp9 33On0i+8xG4PIxV+nRWqZqXbmizbh6ObtA00K1qWRZ6jEfJsKpM/X75UoB221Hphl68n Yka3rbKbML0ZOqcFcBNFej4+/AWOTKnp/kUU6rp69SbIMOLJvaM5ku0lYaGL5zwa2anP Lg== Received: 
from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp2120.oracle.com with ESMTP id 2tjkkq0ceg-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:27:06 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuI021444; Thu, 11 Jul 2019 14:27:03 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 25/26] KVM: x86/asi: Switch to KVM address space on entry to guest Date: Thu, 11 Jul 2019 16:25:37 +0200 Message-Id: <1562855138-19507-26-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=753 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Liran Alon Switch to KVM address space on entry to guest. 
Most KVM #VMExit handlers will run in the KVM isolated address space and switch back to the host address space only before accessing sensitive data. Sensitive data is defined as either host data or other VM data.

Currently, we switch back to the host address space in the following scenarios:
1) When handling guest page-faults, as this will access SPTs, which contain host PFNs.
2) On schedule-out of the vCPU thread
3) On write to guest virtual memory (kvm_write_guest_virt_system() can pull in tons of pages)
4) On return to userspace (e.g. QEMU)
5) On interrupt or exception

Signed-off-by: Liran Alon Signed-off-by: Alexandre Chartre --- arch/x86/kvm/mmu.c | 2 +- arch/x86/kvm/vmx/isolation.c | 2 +- arch/x86/kvm/vmx/vmx.c | 6 ++++++ arch/x86/kvm/vmx/vmx.h | 18 ++++++++++++++++++ arch/x86/kvm/x86.c | 34 +++++++++++++++++++++++++++++++++- arch/x86/kvm/x86.h | 1 + 6 files changed, 60 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 98f6e4f..298f602 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -4067,7 +4067,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code, { int r = 1; - vcpu->arch.l1tf_flush_l1d = true; + kvm_may_access_sensitive_data(vcpu); switch (vcpu->arch.apf.host_apf_reason) { default: trace_kvm_page_fault(fault_address, error_code); diff --git a/arch/x86/kvm/vmx/isolation.c b/arch/x86/kvm/vmx/isolation.c index d82f6b6..8f57f10 100644 --- a/arch/x86/kvm/vmx/isolation.c +++ b/arch/x86/kvm/vmx/isolation.c @@ -34,7 +34,7 @@ * This is set to false by default because it incurs a performance hit * which some users will not want to take for security gain.
*/ -static bool __read_mostly address_space_isolation; +bool __read_mostly address_space_isolation; module_param(address_space_isolation, bool, 0444); /* diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index d47f093..b5867cc 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6458,8 +6458,14 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) if (vcpu->arch.cr2 != read_cr2()) write_cr2(vcpu->arch.cr2); + /* + * Use an isolation barrier as VMExit will restore the isolation + * CR3 while interrupts can abort isolation. + */ + vmx_isolation_barrier_begin(vmx); vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs, vmx->loaded_vmcs->launched); + vmx_isolation_barrier_end(vmx); vcpu->arch.cr2 = read_cr2(); diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index e8de23b..b65f059 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -531,4 +531,22 @@ static inline void decache_tsc_multiplier(struct vcpu_vmx *vmx) int vmx_isolation_init(struct vcpu_vmx *vmx); void vmx_isolation_uninit(struct vcpu_vmx *vmx); +extern bool __read_mostly address_space_isolation; + +static inline void vmx_isolation_barrier_begin(struct vcpu_vmx *vmx) +{ + if (!address_space_isolation || !vmx->vcpu.asi) + return; + + asi_barrier_begin(); +} + +static inline void vmx_isolation_barrier_end(struct vcpu_vmx *vmx) +{ + if (!address_space_isolation || !vmx->vcpu.asi) + return; + + asi_barrier_end(); +} + #endif /* __KVM_X86_VMX_H */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9857992..9458413 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3346,6 +3346,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) * guest. do_debug expects dr6 to be cleared after it runs, do the same. 
*/ set_debugreg(0, 6); + + kvm_may_access_sensitive_data(vcpu); } static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu, @@ -5259,7 +5261,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val, unsigned int bytes, struct x86_exception *exception) { /* kvm_write_guest_virt_system can pull in tons of pages. */ - vcpu->arch.l1tf_flush_l1d = true; + kvm_may_access_sensitive_data(vcpu); return kvm_write_guest_virt_helper(addr, val, bytes, vcpu, PFERR_WRITE_MASK, exception); @@ -7744,6 +7746,32 @@ void __kvm_request_immediate_exit(struct kvm_vcpu *vcpu) } EXPORT_SYMBOL_GPL(__kvm_request_immediate_exit); +static void vcpu_isolation_enter(struct kvm_vcpu *vcpu) +{ + int err; + + if (!vcpu->asi) + return; + + err = asi_enter(vcpu->asi); + if (err) + pr_debug("KVM isolation failed: error %d\n", err); +} + +static void vcpu_isolation_exit(struct kvm_vcpu *vcpu) +{ + if (!vcpu->asi) + return; + + asi_exit(vcpu->asi); +} + +void kvm_may_access_sensitive_data(struct kvm_vcpu *vcpu) +{ + vcpu->arch.l1tf_flush_l1d = true; + vcpu_isolation_exit(vcpu); +} + /* * Returns 1 to let vcpu_run() continue the guest execution loop without * exiting to the userspace. 
Otherwise, the value will be returned to the @@ -7944,6 +7972,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) goto cancel_injection; } + vcpu_isolation_enter(vcpu); + if (req_immediate_exit) { kvm_make_request(KVM_REQ_EVENT, vcpu); kvm_x86_ops->request_immediate_exit(vcpu); @@ -8130,6 +8160,8 @@ static int vcpu_run(struct kvm_vcpu *vcpu) srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); + kvm_may_access_sensitive_data(vcpu); + return r; } diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index a470ff0..69a7402 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -356,5 +356,6 @@ static inline bool kvm_pat_valid(u64 data) void kvm_load_guest_xcr0(struct kvm_vcpu *vcpu); void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu); +void kvm_may_access_sensitive_data(struct kvm_vcpu *vcpu); #endif From patchwork Thu Jul 11 14:25:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandre Chartre X-Patchwork-Id: 11040165 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A35671395 for ; Thu, 11 Jul 2019 14:29:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9374F28ABE for ; Thu, 11 Jul 2019 14:29:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 84C4E28AD1; Thu, 11 Jul 2019 14:29:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 343DF28ABE for ; Thu, 11 Jul 2019 14:29:41 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728789AbfGKO3g (ORCPT ); Thu, 11 Jul 2019 10:29:36 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:39164 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727612AbfGKO3g (ORCPT ); Thu, 11 Jul 2019 10:29:36 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x6BEO8vQ001464; Thu, 11 Jul 2019 14:27:16 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2018-07-02; bh=p0TwseF6Rl2ptej9BTwWkpZCi9Mm6El6sXWBWxSv9Wc=; b=VxnBJQPSh0LmxFyWqVND6G3imbrgiNDc2XYktPKhPxTH7jeK12iQ07uhiR7KSkYssaXb Pv64TstUQ6TZGJ1PMaSuITPpNq63mpZXM6Iv7MEPC/m1XIUd6uAVwnuCuLzWwEkJpk+d bLhzpPXWzEGxFhxH1pYNQvLU4rahHXqpvu4/IcVK0tDJMHq3A7NSNALQrgdnEe+A5MWw HjWBvSrm/StSA0wUZr133cDwxkrZIL+2nTAatg2QWIUq+pis7baTZiMjbcwkfcdkdyAy 8SJFtfEFVLR+g5pFWeVErgqD0yalSI8+c1PK8OfF5ETzFY1qWu5XfRWDWSejjFNNjfXr 8Q== Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by userp2130.oracle.com with ESMTP id 2tjk2u0e5t-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Jul 2019 14:27:15 +0000 Received: from achartre-desktop.fr.oracle.com (dhcp-10-166-106-34.fr.oracle.com [10.166.106.34]) by aserv0021.oracle.com (8.14.4/8.14.4) with ESMTP id x6BEPcuJ021444; Thu, 11 Jul 2019 14:27:06 GMT From: Alexandre Chartre To: pbonzini@redhat.com, rkrcmar@redhat.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: konrad.wilk@oracle.com, jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com, graf@amazon.de, rppt@linux.vnet.ibm.com, alexandre.chartre@oracle.com Subject: [RFC v2 26/26] KVM: x86/asi: Map KVM 
memslots and IO buses into KVM ASI Date: Thu, 11 Jul 2019 16:25:38 +0200 Message-Id: <1562855138-19507-27-git-send-email-alexandre.chartre@oracle.com> X-Mailer: git-send-email 1.7.1 In-Reply-To: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com> X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9314 signatures=668688 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1907110162 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Map KVM memslots and IO buses into KVM ASI. The mappings are checked on each KVM ASI entry because memslots and buses can change. Signed-off-by: Alexandre Chartre --- arch/x86/kvm/x86.c | 36 +++++++++++++++++++++++++++++++++++- include/linux/kvm_host.h | 2 ++ 2 files changed, 37 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9458413..7c52827 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7748,11 +7748,45 @@ void __kvm_request_immediate_exit(struct kvm_vcpu *vcpu) static void vcpu_isolation_enter(struct kvm_vcpu *vcpu) { - int err; + struct kvm *kvm = vcpu->kvm; + struct kvm_io_bus *bus; + int i, err; if (!vcpu->asi) return; + /* + * Check memslots and buses mapping as they tend to change.
+ */ + for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { + if (vcpu->asi_memslots[i] == kvm->memslots[i]) + continue; + pr_debug("remapping kvm memslots[%d]: %px -> %px\n", + i, vcpu->asi_memslots[i], kvm->memslots[i]); + err = asi_remap(vcpu->asi, &vcpu->asi_memslots[i], + kvm->memslots[i], sizeof(struct kvm_memslots)); + if (err) { + pr_debug("failed to map kvm memslots[%d]: error %d\n", + i, err); + } + } + + + for (i = 0; i < KVM_NR_BUSES; i++) { + bus = kvm->buses[i]; + if (bus == vcpu->asi_buses[i]) + continue; + pr_debug("remapped kvm buses[%d]: %px -> %px\n", + i, vcpu->asi_buses[i], bus); + err = asi_remap(vcpu->asi, &vcpu->asi_buses[i], bus, + sizeof(*bus) + bus->dev_count * + sizeof(struct kvm_io_range)); + if (err) { + pr_debug("failed to map kvm buses[%d]: error %d\n", + i, err); + } + } + err = asi_enter(vcpu->asi); if (err) pr_debug("KVM isolation failed: error %d\n", err); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 2a9d073..1f82de4 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -324,6 +324,8 @@ struct kvm_vcpu { #ifdef CONFIG_ADDRESS_SPACE_ISOLATION struct asi *asi; + void *asi_memslots[KVM_ADDRESS_SPACE_NUM]; + void *asi_buses[KVM_NR_BUSES]; #endif };
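
Note on the check-and-remap pattern in 26/26: vcpu_isolation_enter() compares a per-vCPU cached pointer with the live kvm pointer and remaps only when they differ. The following is a minimal userspace sketch of that idea, not the kernel code. fake_vcpu, fake_asi_remap(), sync_mappings() and NR_SLOTS are hypothetical names introduced for illustration only. The sketch also assumes that a successful remap updates the cached pointer, which is how the asi_remap() call in the patch appears to be used but is not confirmed here.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 2	/* hypothetical stand-in for KVM_ADDRESS_SPACE_NUM */

/* Counts how often the (fake) remap path is taken. */
static int remap_calls;

/*
 * Hypothetical stand-in for the series' asi_remap(). Assumption: on
 * success the cached pointer is updated to the newly mapped object.
 */
static int fake_asi_remap(void **cached, void *cur, size_t size)
{
	(void)size;		/* a real implementation would map 'size' bytes */
	remap_calls++;
	*cached = cur;
	return 0;
}

/* Hypothetical stand-in for the asi_memslots[] cache added to kvm_vcpu. */
struct fake_vcpu {
	void *cached[NR_SLOTS];
};

/*
 * Compare each cached pointer with the live one and remap only on change.
 * Returns the number of slots remapped, or a negative error code.
 */
static int sync_mappings(struct fake_vcpu *vcpu, void *live[NR_SLOTS],
			 size_t size)
{
	int i, err, remapped = 0;

	assert(vcpu && live);

	for (i = 0; i < NR_SLOTS; i++) {
		if (vcpu->cached[i] == live[i])
			continue;	/* mapping is still current */
		err = fake_asi_remap(&vcpu->cached[i], live[i], size);
		if (err)
			return err;
		remapped++;
	}
	return remapped;
}
```

With this shape, the common case (nothing changed since the last entry) costs one pointer comparison per slot, and the remap cost is paid only after a memslot or bus update. Unlike the patch, which logs a failed remap with pr_debug() and still enters isolation, the sketch returns the error to the caller.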