From patchwork Thu Sep 19 15:03:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152795 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 27CC414ED for ; Thu, 19 Sep 2019 15:11:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 10C8C2196F for ; Thu, 19 Sep 2019 15:11:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391099AbfISPLI (ORCPT ); Thu, 19 Sep 2019 11:11:08 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50056 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732034AbfISPJv (ORCPT ); Thu, 19 Sep 2019 11:09:51 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy3w-0006n4-LB; Thu, 19 Sep 2019 17:09:40 +0200 Message-Id: <20190919150808.521907403@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:15 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 01/15] entry: Provide generic syscall entry functionality References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On syscall entry certain work needs to be done conditionally like tracing, seccomp etc. This code is duplicated in all architectures. Provide a generic version. Signed-off-by: Thomas Gleixner --- arch/Kconfig | 3 + include/linux/entry-common.h | 122 +++++++++++++++++++++++++++++++++++++++++++ kernel/Makefile | 1 kernel/entry/Makefile | 3 + kernel/entry/common.c | 33 +++++++++++ 5 files changed, 162 insertions(+) --- a/arch/Kconfig +++ b/arch/Kconfig @@ -27,6 +27,9 @@ config HAVE_IMA_KEXEC config HOTPLUG_SMT bool +config GENERIC_ENTRY + bool + config OPROFILE tristate "OProfile system profiling" depends on PROFILING --- /dev/null +++ b/include/linux/entry-common.h @@ -0,0 +1,122 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __LINUX_ENTRYCOMMON_H +#define __LINUX_ENTRYCOMMON_H + +#include +#include +#include +#include +#include + +#include + +/* + * Define dummy _TIF work flags if not defined by the architecture or for + * disabled functionality. + */ +#ifndef _TIF_SYSCALL_TRACE +# define _TIF_SYSCALL_TRACE (0) +#endif + +#ifndef _TIF_SYSCALL_EMU +# define _TIF_SYSCALL_EMU (0) +#endif + +#ifndef _TIF_SYSCALL_TRACEPOINT +# define _TIF_SYSCALL_TRACEPOINT (0) +#endif + +#ifndef _TIF_SECCOMP +# define _TIF_SECCOMP (0) +#endif + +#ifndef _TIF_AUDIT +# define _TIF_AUDIT (0) +#endif + +/* + * TIF flags handled in syscall_enter_from_usermode() + */ +#ifndef ARCH_SYSCALL_ENTER_WORK +# define ARCH_SYSCALL_ENTER_WORK (0) +#endif + +#define SYSCALL_ENTER_WORK \ + (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | TIF_SECCOMP | \ + _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_EMU | \ + ARCH_SYSCALL_ENTER_WORK) + +/** + * arch_syscall_enter_tracehook - Wrapper around tracehook_report_syscall_entry() + * + * Defaults to tracehook_report_syscall_entry(). Can be replaced by + * architecture specific code. + * + * Invoked from syscall_enter_from_usermode() + */ +static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs); + +#ifndef arch_syscall_enter_tracehook +static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs) +{ + return tracehook_report_syscall_entry(regs); +} +#endif + +/** + * arch_syscall_enter_seccomp - Architecture specific seccomp invocation + * @regs: Pointer to currents pt_regs + * + * Invoked from syscall_enter_from_usermode(). Can be replaced by + * architecture specific code. + */ +static inline long arch_syscall_enter_seccomp(struct pt_regs *regs); + +#ifndef arch_syscall_enter_seccomp +static inline long arch_syscall_enter_seccomp(struct pt_regs *regs) +{ + return secure_computing(NULL); +} +#endif + +/** + * arch_syscall_enter_audit - Architecture specific audit invocation + * @regs: Pointer to currents pt_regs + * + * Invoked from syscall_enter_from_usermode(). Must be replaced by + * architecture specific code if the architecture supports audit. + */ +static inline void arch_syscall_enter_audit(struct pt_regs *regs); + +#ifndef arch_syscall_enter_audit +static inline void arch_syscall_enter_audit(struct pt_regs *regs) { } +#endif + +/* Common syscall enter function */ +long core_syscall_enter_from_usermode(struct pt_regs *regs, long syscall); + +/** + * syscall_enter_from_usermode - Check and handle work before invoking + * a syscall + * @regs: Pointer to currents pt_regs + * @syscall: The syscall number + * + * Invoked from architecture specific syscall entry code with interrupts + * enabled. + * + * Returns: The original or a modified syscall number + */ +static inline long syscall_enter_from_usermode(struct pt_regs *regs, + long syscall) +{ + unsigned long ti_work = READ_ONCE(current_thread_info()->flags); + + if (IS_ENABLED(CONFIG_DEBUG_ENTRY)) + BUG_ON(regs != task_pt_regs(current)); + + if (ti_work & SYSCALL_ENTER_WORK) + syscall = core_syscall_enter_from_usermode(regs, syscall); + return syscall; +} + +#endif --- a/kernel/Makefile +++ b/kernel/Makefile @@ -43,6 +43,7 @@ obj-y += irq/ obj-y += rcu/ obj-y += livepatch/ obj-y += dma/ +obj-y += entry/ obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o obj-$(CONFIG_FREEZER) += freezer.o --- /dev/null +++ b/kernel/entry/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-$(CONFIG_GENERIC_ENTRY) += common.o --- /dev/null +++ b/kernel/entry/common.c @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include + +#define CREATE_TRACE_POINTS +#include + +long core_syscall_enter_from_usermode(struct pt_regs *regs, long syscall) +{ + unsigned long ti_work = READ_ONCE(current_thread_info()->flags); + unsigned long ret = 0; + + if (ti_work & (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU)) { + ret = arch_syscall_enter_tracehook(regs); + if (ret || (ti_work & _TIF_SYSCALL_EMU)) + return -1L; + } + + /* Do seccomp after ptrace, to catch any tracer changes. */ + if (ti_work & _TIF_SECCOMP) { + ret = arch_syscall_enter_seccomp(regs); + if (ret == -1L) + return ret; + } + + if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT))) + trace_sys_enter(regs, syscall); + + arch_syscall_enter_audit(regs); + + return ret ? : syscall; +} From patchwork Thu Sep 19 15:03:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152797 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4CF281599 for ; Thu, 19 Sep 2019 15:11:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2BCB821929 for ; Thu, 19 Sep 2019 15:11:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390925AbfISPJu (ORCPT ); Thu, 19 Sep 2019 11:09:50 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50047 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726007AbfISPJt (ORCPT ); Thu, 19 Sep 2019 11:09:49 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy3x-0006nE-Lc; Thu, 19 Sep 2019 17:09:41 +0200 Message-Id: <20190919150808.617944343@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:16 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 02/15] x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Evaluating _TIF_NOHZ to decide whether to use the slow syscall entry path is not only pointless, it's actually counterproductive: 1) Context tracking code is invoked unconditionally before that flag is evaluated. 2) If the flag is set the slow path is invoked for nothing due to #1 Remove it. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/thread_info.h | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -133,14 +133,10 @@ struct thread_info { #define _TIF_X32 (1 << TIF_X32) #define _TIF_FSCHECK (1 << TIF_FSCHECK) -/* - * work to do in syscall_trace_enter(). Also includes TIF_NOHZ for - * enter_from_user_mode() - */ +/* Work to do before invoking the actual syscall. */ #define _TIF_WORK_SYSCALL_ENTRY \ (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT | \ - _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \ - _TIF_NOHZ) + _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT) /* flags to check in __switch_to() */ #define _TIF_WORK_CTXSW_BASE \ From patchwork Thu Sep 19 15:03:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152771 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 98B6514ED for ; Thu, 19 Sep 2019 15:09:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1112421927 for ; Thu, 19 Sep 2019 15:09:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403919AbfISPJx (ORCPT ); Thu, 19 Sep 2019 11:09:53 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50057 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732323AbfISPJw (ORCPT ); Thu, 19 Sep 2019 11:09:52 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy3y-0006nZ-HC; Thu, 19 Sep 2019 17:09:42 +0200 Message-Id: <20190919150808.724554170@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:17 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 03/15] x86/entry: Use generic syscall entry function References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Replace the syscall entry work handling with the generic version, Provide the necessary helper inlines to handle the real architecture specific parts, e.g. audit and seccomp invocations. Signed-off-by: Thomas Gleixner --- arch/x86/Kconfig | 1 arch/x86/entry/common.c | 108 +++--------------------------------- arch/x86/include/asm/entry-common.h | 59 +++++++++++++++++++ arch/x86/include/asm/thread_info.h | 5 - 4 files changed, 70 insertions(+), 103 deletions(-) --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -110,6 +110,7 @@ config X86 select GENERIC_CPU_AUTOPROBE select GENERIC_CPU_VULNERABILITIES select GENERIC_EARLY_IOREMAP + select GENERIC_ENTRY select GENERIC_FIND_FIRST_BIT select GENERIC_IOMAP select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP --- a/arch/x86/entry/common.c +++ b/arch/x86/entry/common.c @@ -10,13 +10,13 @@ #include #include #include +#include #include #include #include #include #include #include -#include #include #include #include @@ -34,7 +34,6 @@ #include #include -#define CREATE_TRACE_POINTS #include #ifdef CONFIG_CONTEXT_TRACKING @@ -48,86 +47,6 @@ static inline void enter_from_user_mode(void) {} #endif -static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch) -{ -#ifdef CONFIG_X86_64 - if (arch == AUDIT_ARCH_X86_64) { - audit_syscall_entry(regs->orig_ax, regs->di, - regs->si, regs->dx, regs->r10); - } else -#endif - { - audit_syscall_entry(regs->orig_ax, regs->bx, - regs->cx, regs->dx, regs->si); - } -} - -/* - * Returns the syscall nr to run (which should match regs->orig_ax) or -1 - * to skip the syscall. - */ -static long syscall_trace_enter(struct pt_regs *regs) -{ - u32 arch = in_ia32_syscall() ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64; - - struct thread_info *ti = current_thread_info(); - unsigned long ret = 0; - u32 work; - - if (IS_ENABLED(CONFIG_DEBUG_ENTRY)) - BUG_ON(regs != task_pt_regs(current)); - - work = READ_ONCE(ti->flags); - - if (work & (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU)) { - ret = tracehook_report_syscall_entry(regs); - if (ret || (work & _TIF_SYSCALL_EMU)) - return -1L; - } - -#ifdef CONFIG_SECCOMP - /* - * Do seccomp after ptrace, to catch any tracer changes. - */ - if (work & _TIF_SECCOMP) { - struct seccomp_data sd; - - sd.arch = arch; - sd.nr = regs->orig_ax; - sd.instruction_pointer = regs->ip; -#ifdef CONFIG_X86_64 - if (arch == AUDIT_ARCH_X86_64) { - sd.args[0] = regs->di; - sd.args[1] = regs->si; - sd.args[2] = regs->dx; - sd.args[3] = regs->r10; - sd.args[4] = regs->r8; - sd.args[5] = regs->r9; - } else -#endif - { - sd.args[0] = regs->bx; - sd.args[1] = regs->cx; - sd.args[2] = regs->dx; - sd.args[3] = regs->si; - sd.args[4] = regs->di; - sd.args[5] = regs->bp; - } - - ret = __secure_computing(&sd); - if (ret == -1) - return ret; - } -#endif - - if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT))) - trace_sys_enter(regs, regs->orig_ax); - - do_audit_syscall_entry(regs, arch); - - return ret ?: regs->orig_ax; -} - #define EXIT_TO_USERMODE_LOOP_FLAGS \ (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ _TIF_NEED_RESCHED | _TIF_USER_RETURN_NOTIFY | _TIF_PATCH_PENDING) @@ -277,13 +196,10 @@ static void syscall_slow_exit_work(struc #ifdef CONFIG_X86_64 __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs) { - struct thread_info *ti; - enter_from_user_mode(); local_irq_enable(); - ti = current_thread_info(); - if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY) - nr = syscall_trace_enter(regs); + + nr = syscall_enter_from_usermode(regs, nr); if (likely(nr < NR_syscalls)) { nr = array_index_nospec(nr, NR_syscalls); @@ -310,22 +226,18 @@ static void syscall_slow_exit_work(struc */ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs) { - struct thread_info *ti = current_thread_info(); unsigned int nr = (unsigned int)regs->orig_ax; #ifdef CONFIG_IA32_EMULATION - ti->status |= TS_COMPAT; + current_thread_info()->status |= TS_COMPAT; #endif - if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY) { - /* - * Subtlety here: if ptrace pokes something larger than - * 2^32-1 into orig_ax, this truncates it. This may or - * may not be necessary, but it matches the old asm - * behavior. - */ - nr = syscall_trace_enter(regs); - } + /* + * Subtlety here: if ptrace pokes something larger than 2^32-1 into + * orig_ax, this truncates it. This may or may not be necessary, + * but it matches the old asm behavior. + */ + nr = syscall_enter_from_usermode(regs, nr); if (likely(nr < IA32_NR_syscalls)) { nr = array_index_nospec(nr, IA32_NR_syscalls); --- /dev/null +++ b/arch/x86/include/asm/entry-common.h @@ -0,0 +1,59 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef _ASM_X86_ENTRY_COMMON_H +#define _ASM_X86_ENTRY_COMMON_H + +#include +#include + +static inline long arch_syscall_enter_seccomp(struct pt_regs *regs) +{ +#ifdef CONFIG_SECCOMP + u32 arch = in_ia32_syscall() ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64; + struct seccomp_data sd; + + sd.arch = arch; + sd.nr = regs->orig_ax; + sd.instruction_pointer = regs->ip; + +#ifdef CONFIG_X86_64 + if (arch == AUDIT_ARCH_X86_64) { + sd.args[0] = regs->di; + sd.args[1] = regs->si; + sd.args[2] = regs->dx; + sd.args[3] = regs->r10; + sd.args[4] = regs->r8; + sd.args[5] = regs->r9; + } else +#endif + { + sd.args[0] = regs->bx; + sd.args[1] = regs->cx; + sd.args[2] = regs->dx; + sd.args[3] = regs->si; + sd.args[4] = regs->di; + sd.args[5] = regs->bp; + } + + return __secure_computing(&sd); +#else + return 0; +#endif +} +#define arch_syscall_enter_seccomp arch_syscall_enter_seccomp + +static inline void arch_syscall_enter_audit(struct pt_regs *regs) +{ +#ifdef CONFIG_X86_64 + if (in_ia32_syscall()) { + audit_syscall_entry(regs->orig_ax, regs->di, + regs->si, regs->dx, regs->r10); + } else +#endif + { + audit_syscall_entry(regs->orig_ax, regs->bx, + regs->cx, regs->dx, regs->si); + } +} +#define arch_syscall_enter_audit arch_syscall_enter_audit + +#endif --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -133,11 +133,6 @@ struct thread_info { #define _TIF_X32 (1 << TIF_X32) #define _TIF_FSCHECK (1 << TIF_FSCHECK) -/* Work to do before invoking the actual syscall. */ -#define _TIF_WORK_SYSCALL_ENTRY \ - (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU | _TIF_SYSCALL_AUDIT | \ - _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT) - /* flags to check in __switch_to() */ #define _TIF_WORK_CTXSW_BASE \ (_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP| \ From patchwork Thu Sep 19 15:03:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152769 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 926981599 for ; Thu, 19 Sep 2019 15:09:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 77AD421D7A for ; Thu, 19 Sep 2019 15:09:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390949AbfISPJw (ORCPT ); Thu, 19 Sep 2019 11:09:52 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50051 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732001AbfISPJv (ORCPT ); Thu, 19 Sep 2019 11:09:51 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy3z-0006nq-86; Thu, 19 Sep 2019 17:09:43 +0200 Message-Id: <20190919150808.830764150@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:18 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 04/15] arm64/entry: Use generic syscall entry function References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Replace the syscall entry work handling with the generic version, Provide the necessary helper inlines to handle the real architecture specific parts, e.g. audit and seccomp invocations. Signed-off-by: Thomas Gleixner --- arch/arm64/Kconfig | 1 arch/arm64/include/asm/entry-common.h | 39 ++++++++++++++++++++++++++++++++++ arch/arm64/kernel/ptrace.c | 29 ------------------------- arch/arm64/kernel/syscall.c | 15 +++++-------- 4 files changed, 47 insertions(+), 37 deletions(-) --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -99,6 +99,7 @@ config ARM64 select GENERIC_CPU_AUTOPROBE select GENERIC_CPU_VULNERABILITIES select GENERIC_EARLY_IOREMAP + select GENERIC_ENTRY select GENERIC_IDLE_POLL_SETUP select GENERIC_IRQ_MULTI_HANDLER select GENERIC_IRQ_PROBE --- /dev/null +++ b/arch/arm64/include/asm/entry-common.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2017 ARM Ltd. + */ +#ifndef __ASM_ENTRY_COMMON_H +#define __ASM_ENTRY_COMMON_H + +enum ptrace_syscall_dir { + PTRACE_SYSCALL_ENTER = 0, + PTRACE_SYSCALL_EXIT, +}; + +/* + * A scratch register (ip(r12) on AArch32, x7 on AArch64) is + * used to denote syscall entry/exit for the tracehooks + */ +static inline __must_check int arch_syscall_enter_tracehook(struct pt_regs *regs) +{ + int regno = (is_compat_task() ? 12 : 7); + unsigned long reg = regs->regs[regno]; + long ret; + + regs->regs[regno] = PTRACE_SYSCALL_ENTER; + ret = tracehook_report_syscall_entry(regs); + if (ret) + forget_syscall(regs); + regs->regs[regno] = reg; + return ret; +} +#define arch_syscall_enter_tracehook arch_syscall_enter_tracehook + +static inline void arch_syscall_enter_audit(struct pt_regs *regs) +{ + audit_syscall_entry(regs->syscallno, regs->orig_x0, regs->regs[1], + regs->regs[2], regs->regs[3]); +} +#define arch_syscall_enter_audit arch_syscall_enter_audit + +#endif --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -41,7 +42,6 @@ #include #include -#define CREATE_TRACE_POINTS #include struct pt_regs_offset { @@ -1779,11 +1779,6 @@ long arch_ptrace(struct task_struct *chi return ptrace_request(child, request, addr, data); } -enum ptrace_syscall_dir { - PTRACE_SYSCALL_ENTER = 0, - PTRACE_SYSCALL_EXIT, -}; - static void tracehook_report_syscall(struct pt_regs *regs, enum ptrace_syscall_dir dir) { @@ -1806,28 +1801,6 @@ static void tracehook_report_syscall(str regs->regs[regno] = saved_reg; } -int syscall_trace_enter(struct pt_regs *regs) -{ - if (test_thread_flag(TIF_SYSCALL_TRACE) || - test_thread_flag(TIF_SYSCALL_EMU)) { - tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER); - if (!in_syscall(regs) || test_thread_flag(TIF_SYSCALL_EMU)) - return -1; - } - - /* Do the secure computing after ptrace; failures should be fast. */ - if (secure_computing(NULL) == -1) - return -1; - - if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) - trace_sys_enter(regs, regs->syscallno); - - audit_syscall_entry(regs->syscallno, regs->orig_x0, regs->regs[1], - regs->regs[2], regs->regs[3]); - - return regs->syscallno; -} - void syscall_trace_exit(struct pt_regs *regs) { audit_syscall_exit(regs); --- a/arch/arm64/kernel/syscall.c +++ b/arch/arm64/kernel/syscall.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include +#include #include #include #include @@ -58,7 +59,6 @@ static inline bool has_syscall_work(unsi return unlikely(flags & _TIF_SYSCALL_WORK); } -int syscall_trace_enter(struct pt_regs *regs); void syscall_trace_exit(struct pt_regs *regs); #ifdef CONFIG_ARM64_ERRATUM_1463225 @@ -97,19 +97,16 @@ static void el0_svc_common(struct pt_reg regs->orig_x0 = regs->regs[0]; regs->syscallno = scno; + /* Set default error number */ + regs->regs[0] = -ENOSYS; cortex_a76_erratum_1463225_svc_handler(); local_daif_restore(DAIF_PROCCTX); user_exit(); - if (has_syscall_work(flags)) { - /* set default errno for user-issued syscall(-1) */ - if (scno == NO_SYSCALL) - regs->regs[0] = -ENOSYS; - scno = syscall_trace_enter(regs); - if (scno == NO_SYSCALL) - goto trace_exit; - } + scno = syscall_enter_from_usermode(regs, scno); + if (scno == NO_SYSCALL) + goto trace_exit; invoke_syscall(regs, scno, sc_nr, syscall_table); From patchwork Thu Sep 19 15:03:19 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152791 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9C2A514ED for ; Thu, 19 Sep 2019 15:11:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 83FFB21848 for ; Thu, 19 Sep 2019 15:11:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391065AbfISPKt (ORCPT ); Thu, 19 Sep 2019 11:10:49 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50062 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390934AbfISPJw (ORCPT ); Thu, 19 Sep 2019 11:09:52 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy40-0006o5-D6; Thu, 19 Sep 2019 17:09:44 +0200 Message-Id: <20190919150808.936484726@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:19 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 05/15] entry: Provide generic syscall exit function References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Similar to syscall entry all architectures have similar and pointlessly different code to handle pending work before returning from a syscall to user space. Provide a generic version. Signed-off-by: Thomas Gleixner --- include/linux/entry-common.h | 31 ++++++++++++++++++++++++ kernel/entry/common.c | 55 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 86 insertions(+) --- a/include/linux/entry-common.h +++ b/include/linux/entry-common.h @@ -45,6 +45,17 @@ _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_EMU | \ ARCH_SYSCALL_ENTER_WORK) +/* + * TIF flags handled in syscall_exit_to_usermode() + */ +#ifndef ARCH_SYSCALL_EXIT_WORK +# define ARCH_SYSCALL_EXIT_WORK (0) +#endif + +#define SYSCALL_EXIT_WORK \ + (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ + _TIF_SYSCALL_TRACEPOINT | ARCH_SYSCALL_EXIT_WORK) + /** * arch_syscall_enter_tracehook - Wrapper around tracehook_report_syscall_entry() * @@ -118,4 +129,24 @@ static inline long syscall_enter_from_us return syscall; } +/** + * arch_syscall_exit_tracehook - Wrapper around tracehook_report_syscall_exit() + * + * Defaults to tracehook_report_syscall_exit(). Can be replaced by + * architecture specific code. + * + * Invoked from syscall_exit_to_usermode() + */ +static inline void arch_syscall_exit_tracehook(struct pt_regs *regs, bool step); + +#ifndef arch_syscall_exit_tracehook +static inline void arch_syscall_exit_tracehook(struct pt_regs *regs, bool step) +{ + tracehook_report_syscall_exit(regs, step); +} +#endif + +/* Common syscall exit function */ +void syscall_exit_to_usermode(struct pt_regs *regs, long syscall, long retval); + #endif --- a/kernel/entry/common.c +++ b/kernel/entry/common.c @@ -31,3 +31,58 @@ long core_syscall_enter_from_usermode(st return ret ? : syscall; } + +#ifndef _TIF_SINGLESTEP +static inline bool report_single_step(unsigned long ti_work) +{ + return false; +} +#else +/* + * If TIF_SYSCALL_EMU is set, then the only reason to report is when + * TIF_SINGLESTEP is set (i.e. PTRACE_SYSEMU_SINGLESTEP). This syscall + * instruction has been already reported in syscall_enter_from_usermode(). + */ +#define SYSEMU_STEP (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU) + +static inline bool report_single_step(unsigned long ti_work) +{ + return (ti_work & SYSEMU_STEP) == _TIF_SINGLESTEP; +} +#endif + +static void syscall_exit_work(struct pt_regs *regs, long retval, + unsigned long ti_work) +{ + bool step; + + audit_syscall_exit(regs); + + if (ti_work & _TIF_SYSCALL_TRACEPOINT) + trace_sys_exit(regs, retval); + + step = report_single_step(ti_work); + if (step || ti_work & _TIF_SYSCALL_TRACE) + arch_syscall_exit_tracehook(regs, step); +} + +void syscall_exit_to_usermode(struct pt_regs *regs, long syscall, long retval) +{ + unsigned long ti_work; + + CT_WARN_ON(ct_state() != CONTEXT_KERNEL); + + if (IS_ENABLED(CONFIG_PROVE_LOCKING) && + WARN(irqs_disabled(), "syscall %ld left IRQs disabled", syscall)) + local_irq_enable(); + + rseq_syscall(regs); + + /* + * Handle work which needs to run exactly once per syscall exit + * with interrupts enabled. + */ + ti_work = READ_ONCE(current_thread_info()->flags); + if (unlikely(ti_work & SYSCALL_EXIT_WORK)) + syscall_exit_work(regs, retval, ti_work); +} From patchwork Thu Sep 19 15:03:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152789 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8C63F14ED for ; Thu, 19 Sep 2019 15:11:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6F27221D79 for ; Thu, 19 Sep 2019 15:11:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391075AbfISPKu (ORCPT ); Thu, 19 Sep 2019 11:10:50 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50065 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390941AbfISPJw (ORCPT ); Thu, 19 Sep 2019 11:09:52 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy41-0006oE-EB; Thu, 19 Sep 2019 17:09:45 +0200 Message-Id: <20190919150809.041298671@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:20 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 06/15] x86/entry: Use generic syscall exit functionality References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Replace the x86 variant with the generic version. Signed-off-by: Thomas Gleixner --- arch/x86/entry/common.c | 44 ------------------------------------ arch/x86/include/asm/entry-common.h | 2 + 2 files changed, 3 insertions(+), 43 deletions(-) --- a/arch/x86/entry/common.c +++ b/arch/x86/entry/common.c @@ -139,55 +139,13 @@ static void exit_to_usermode_loop(struct mds_user_clear_cpu_buffers(); } -#define SYSCALL_EXIT_WORK_FLAGS \ - (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ - _TIF_SINGLESTEP | _TIF_SYSCALL_TRACEPOINT) - -static void syscall_slow_exit_work(struct pt_regs *regs, u32 cached_flags) -{ - bool step; - - audit_syscall_exit(regs); - - if (cached_flags & _TIF_SYSCALL_TRACEPOINT) - trace_sys_exit(regs, regs->ax); - - /* - * If TIF_SYSCALL_EMU is set, we only get here because of - * TIF_SINGLESTEP (i.e. this is PTRACE_SYSEMU_SINGLESTEP). - * We already reported this syscall instruction in - * syscall_trace_enter(). - */ - step = unlikely( - (cached_flags & (_TIF_SINGLESTEP | _TIF_SYSCALL_EMU)) - == _TIF_SINGLESTEP); - if (step || cached_flags & _TIF_SYSCALL_TRACE) - tracehook_report_syscall_exit(regs, step); -} - /* * Called with IRQs on and fully valid regs. Returns with IRQs off in a * state such that we can immediately switch to user mode. */ __visible inline void syscall_return_slowpath(struct pt_regs *regs) { - struct thread_info *ti = current_thread_info(); - u32 cached_flags = READ_ONCE(ti->flags); - - CT_WARN_ON(ct_state() != CONTEXT_KERNEL); - - if (IS_ENABLED(CONFIG_PROVE_LOCKING) && - WARN(irqs_disabled(), "syscall %ld left IRQs disabled", regs->orig_ax)) - local_irq_enable(); - - rseq_syscall(regs); - - /* - * First do one-time work. If these work items are enabled, we - * want to run them exactly once per syscall exit with IRQs on. - */ - if (unlikely(cached_flags & SYSCALL_EXIT_WORK_FLAGS)) - syscall_slow_exit_work(regs, cached_flags); + syscall_exit_to_usermode(regs, regs->orig_ax, regs->ax); local_irq_disable(); prepare_exit_to_usermode(regs); --- a/arch/x86/include/asm/entry-common.h +++ b/arch/x86/include/asm/entry-common.h @@ -5,6 +5,8 @@ #include #include +#define ARCH_SYSCALL_EXIT_WORK (_TIF_SINGLESTEP) + static inline long arch_syscall_enter_seccomp(struct pt_regs *regs) { #ifdef CONFIG_SECCOMP From patchwork Thu Sep 19 15:03:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152793 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 90E7214ED for ; Thu, 19 Sep 2019 15:11:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7043521A49 for ; Thu, 19 Sep 2019 15:11:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391054AbfISPKt (ORCPT ); Thu, 19 Sep 2019 11:10:49 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50069 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390943AbfISPJw (ORCPT ); Thu, 19 Sep 2019 11:09:52 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy42-0006oN-8x; Thu, 19 Sep 2019 17:09:46 +0200 Message-Id: <20190919150809.145400160@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:21 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 07/15] arm64/syscall: Remove obscure flag check References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The syscall handling code has an obscure check of pending work which does a shortcut before returning to user space. It calls into the exit work code when the flags at entry time required an entry into the slowpath. That does not make sense because the underlying work functionality will reevaluate the flags anyway and not do anything. Replace that by a straight forward test for work flags. Preparatory change for switching to the generic syscall exit work handling code. Signed-off-by: Thomas Gleixner --- arch/arm64/kernel/syscall.c | 32 ++++++-------------------------- 1 file changed, 6 insertions(+), 26 deletions(-) --- a/arch/arm64/kernel/syscall.c +++ b/arch/arm64/kernel/syscall.c @@ -93,8 +93,6 @@ static void cortex_a76_erratum_1463225_s static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr, const syscall_fn_t syscall_table[]) { - unsigned long flags = current_thread_info()->flags; - regs->orig_x0 = regs->regs[0]; regs->syscallno = scno; /* Set default error number */ @@ -105,33 +103,15 @@ static void el0_svc_common(struct pt_reg user_exit(); scno = syscall_enter_from_usermode(regs, scno); - if (scno == NO_SYSCALL) - goto trace_exit; - - invoke_syscall(regs, scno, sc_nr, syscall_table); + if (scno != NO_SYSCALL) + invoke_syscall(regs, scno, sc_nr, syscall_table); - /* - * The tracing status may have changed under our feet, so we have to - * check again. However, if we were tracing entry, then we always trace - * exit regardless, as the old entry assembly did. - */ - if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) { - local_daif_mask(); - flags = current_thread_info()->flags; - if (!has_syscall_work(flags)) { - /* - * We're off to userspace, where interrupts are - * always enabled after we restore the flags from - * the SPSR. - */ - trace_hardirqs_on(); - return; - } + local_daif_mask(); + if (has_syscall_work(current_thread_info()->flags) || + IS_ENABLED(CONFIG_DEBUG_RSEQ)) { local_daif_restore(DAIF_PROCCTX); + syscall_trace_exit(regs); } - -trace_exit: - syscall_trace_exit(regs); } static inline void sve_user_discard(void) From patchwork Thu Sep 19 15:03:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152787 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AE4301599 for ; Thu, 19 Sep 2019 15:10:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9221321928 for ; Thu, 19 Sep 2019 15:10:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391037AbfISPKl (ORCPT ); Thu, 19 Sep 2019 11:10:41 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50078 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390955AbfISPJy (ORCPT ); Thu, 19 Sep 2019 11:09:54 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy42-0006oe-Vu; Thu, 19 Sep 2019 17:09:47 +0200 Message-Id: <20190919150809.235562522@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:22 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 08/15] arm64/syscall: Use generic syscall exit functionality References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Replace the syscall exit handling code with the generic version. That gets rid of the interrupts disabled check for work flags. That's a non-issue because the flags are checked with READ_ONCE() in the core and not reevaluated. That's perfectly fine because the interrupts disabled check is also just a snapshot. Signed-off-by: Thomas Gleixner --- arch/arm64/include/asm/entry-common.h | 11 +++++++++ arch/arm64/kernel/ptrace.c | 38 ---------------------------------- arch/arm64/kernel/syscall.c | 14 ------------ 3 files changed, 12 insertions(+), 51 deletions(-) --- a/arch/arm64/include/asm/entry-common.h +++ b/arch/arm64/include/asm/entry-common.h @@ -29,6 +29,17 @@ static inline __must_check int arch_sysc } #define arch_syscall_enter_tracehook arch_syscall_enter_tracehook +static inline void arch_syscall_exit_tracehook(struct pt_regs *regs, bool step) +{ + int regno = (is_compat_task() ? 12 : 7); + unsigned long reg = regs->regs[regno]; + + regs->regs[regno] = PTRACE_SYSCALL_EXIT; + tracehook_report_syscall_exit(regs, step); + regs->regs[regno] = reg; +} +#define arch_syscall_exit_tracehook arch_syscall_exit_tracehook + static inline void arch_syscall_enter_audit(struct pt_regs *regs) { audit_syscall_entry(regs->syscallno, regs->orig_x0, regs->regs[1], --- a/arch/arm64/kernel/ptrace.c +++ b/arch/arm64/kernel/ptrace.c @@ -13,13 +13,11 @@ #include #include #include -#include #include #include #include #include #include -#include #include #include #include @@ -28,7 +26,6 @@ #include #include #include -#include #include #include @@ -1779,41 +1776,6 @@ long arch_ptrace(struct task_struct *chi return ptrace_request(child, request, addr, data); } -static void tracehook_report_syscall(struct pt_regs *regs, - enum ptrace_syscall_dir dir) -{ - int regno; - unsigned long saved_reg; - - /* - * A scratch register (ip(r12) on AArch32, x7 on AArch64) is - * used to denote syscall entry/exit: - */ - regno = (is_compat_task() ? 12 : 7); - saved_reg = regs->regs[regno]; - regs->regs[regno] = dir; - - if (dir == PTRACE_SYSCALL_EXIT) - tracehook_report_syscall_exit(regs, 0); - else if (tracehook_report_syscall_entry(regs)) - forget_syscall(regs); - - regs->regs[regno] = saved_reg; -} - -void syscall_trace_exit(struct pt_regs *regs) -{ - audit_syscall_exit(regs); - - if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) - trace_sys_exit(regs, regs_return_value(regs)); - - if (test_thread_flag(TIF_SYSCALL_TRACE)) - tracehook_report_syscall(regs, PTRACE_SYSCALL_EXIT); - - rseq_syscall(regs); -} - /* * SPSR_ELx bits which are always architecturally RES0 per ARM DDI 0487D.a. * We permit userspace to set SSBS (AArch64 bit 12, AArch32 bit 23) which is --- a/arch/arm64/kernel/syscall.c +++ b/arch/arm64/kernel/syscall.c @@ -54,13 +54,6 @@ static void invoke_syscall(struct pt_reg regs->regs[0] = ret; } -static inline bool has_syscall_work(unsigned long flags) -{ - return unlikely(flags & _TIF_SYSCALL_WORK); -} - -void syscall_trace_exit(struct pt_regs *regs); - #ifdef CONFIG_ARM64_ERRATUM_1463225 DECLARE_PER_CPU(int, __in_cortex_a76_erratum_1463225_wa); @@ -106,12 +99,7 @@ static void el0_svc_common(struct pt_reg if (scno != NO_SYSCALL) invoke_syscall(regs, scno, sc_nr, syscall_table); - local_daif_mask(); - if (has_syscall_work(current_thread_info()->flags) || - IS_ENABLED(CONFIG_DEBUG_RSEQ)) { - local_daif_restore(DAIF_PROCCTX); - syscall_trace_exit(regs); - } + syscall_exit_to_usermode(regs, scno, regs_return_value(regs)); } static inline void sve_user_discard(void) From patchwork Thu Sep 19 15:03:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152785 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6F8C51599 for ; Thu, 19 Sep 2019 15:10:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 49E4821924 for ; Thu, 19 Sep 2019 15:10:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391020AbfISPKe (ORCPT ); Thu, 19 Sep 2019 11:10:34 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50083 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2403894AbfISPJz (ORCPT ); Thu, 19 Sep 2019 11:09:55 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy44-0006om-B7; Thu, 19 Sep 2019 17:09:48 +0200 Message-Id: <20190919150809.340471236@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:23 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 09/15] entry: Provide generic exit to usermode functionality References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Provide a generic facility to handle the exit to usermode work. That's aimed to replace the pointlessly different copies in each architecture. Signed-off-by: Thomas Gleixner --- include/linux/entry-common.h | 105 +++++++++++++++++++++++++++++++++++++++++++ kernel/entry/common.c | 88 ++++++++++++++++++++++++++++++++++++ 2 files changed, 193 insertions(+) --- a/include/linux/entry-common.h +++ b/include/linux/entry-common.h @@ -34,6 +34,30 @@ # define _TIF_AUDIT (0) #endif +#ifndef _TIF_UPROBE +# define _TIF_UPROBE (0) +#endif + +#ifndef _TIF_PATCH_PENDING +# define _TIF_PATCH_PENDING (0) +#endif + +#ifndef _TIF_NOTIFY_RESUME +# define _TIF_NOTIFY_RESUME (0) +#endif + +/* + * TIF flags handled in exit_to_usermode() + */ +#ifndef ARCH_EXIT_TO_USERMODE_WORK +# define ARCH_EXIT_TO_USERMODE_WORK (0) +#endif + +#define EXIT_TO_USERMODE_WORK \ + (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ + _TIF_NEED_RESCHED | _TIF_PATCH_PENDING | \ + ARCH_EXIT_TO_USERMODE_WORK) + /* * TIF flags handled in syscall_enter_from_usermode() */ @@ -58,6 +82,87 @@ _TIF_SYSCALL_TRACEPOINT | ARCH_SYSCALL_EXIT_WORK) /** + * local_irq_enable_exit_to_user - Exit to user variant of local_irq_enable() + * @ti_work: Cached TIF flags gathered with interrupts disabled + * + * Defaults to local_irq_enable(). Can be supplied by architecture specific + * code. + */ +static inline void local_irq_enable_exit_to_user(unsigned long ti_work); + +#ifndef local_irq_enable_exit_to_user +static inline void local_irq_enable_exit_to_user(unsigned long ti_work) +{ + local_irq_enable(); +} +#endif + +/** + * local_irq_disable_exit_to_user - Exit to user variant of local_irq_disable() + * + * Defaults to local_irq_disable(). Can be supplied by architecture specific + * code. + */ +static inline void local_irq_disable_exit_to_user(void); + +#ifndef local_irq_disable_exit_to_user +static inline void local_irq_disable_exit_to_user(void) +{ + local_irq_disable(); +} +#endif + +/** + * arch_exit_to_usermode_work - Architecture specific TIF work for + * exit to user mode. + * @regs: Pointer to currents pt_regs + * @ti_work: Cached TIF flags gathered with interrupts disabled + * + * Invoked from exit_to_usermode() with interrupt disabled + * + * Defaults to NOOP. Can be supplied by architecture specific code. + */ +static inline void arch_exit_to_usermode_work(struct pt_regs *regs, + unsigned long ti_work); + +#ifndef arch_exit_to_usermode_work +static inline void arch_exit_to_usermode_work(struct pt_regs *regs, + unsigned long ti_work) +{ +} +#endif + +/** + * arch_exit_to_usermode - Architecture specific preparation for + * exit to user mode. + * @regs: Pointer to currents pt_regs + * @ti_work: Cached TIF flags gathered with interrupts disabled + * + * Invoked from exit_to_usermode() with interrupt disabled as the last + * function before return. + */ +static inline void arch_exit_to_usermode(struct pt_regs *regs, + unsigned long ti_work); + +#ifndef arch_exit_to_usermode +static inline void arch_exit_to_usermode(struct pt_regs *regs, + unsigned long ti_work) +{ +} +#endif + +/* Common exit to usermode function to handle TIF work */ +asmlinkage __visible void exit_to_usermode(struct pt_regs *regs); + +/** + * arch_do_signal - Architecture specific signal delivery function + * @regs: Pointer to currents pt_regs + * + * Invoked from exit_to_usermode() + */ +void arch_do_signal(struct pt_regs *regs); + +/** * arch_syscall_enter_tracehook - Wrapper around tracehook_report_syscall_entry() * * Defaults to tracehook_report_syscall_entry(). Can be replaced by --- a/kernel/entry/common.c +++ b/kernel/entry/common.c @@ -2,10 +2,90 @@ #include #include +#include +#include #define CREATE_TRACE_POINTS #include +static unsigned long core_exit_to_usermode_work(struct pt_regs *regs, + unsigned long ti_work) +{ + /* + * Before returning to user space ensure that all pending work + * items have been completed. + */ + while (ti_work & EXIT_TO_USERMODE_WORK) { + + local_irq_enable_exit_to_user(ti_work); + + if (ti_work & _TIF_NEED_RESCHED) + schedule(); + + if (ti_work & _TIF_UPROBE) + uprobe_notify_resume(regs); + + if (ti_work & _TIF_PATCH_PENDING) + klp_update_patch_state(current); + + if (ti_work & _TIF_SIGPENDING) + arch_do_signal(regs); + + if (ti_work & _TIF_NOTIFY_RESUME) { + clear_thread_flag(TIF_NOTIFY_RESUME); + tracehook_notify_resume(regs); + rseq_handle_notify_resume(NULL, regs); + } + + /* Architecture specific TIF work */ + arch_exit_to_usermode_work(regs, ti_work); + + /* + * Disable interrupts and reevaluate the work flags as they + * might have changed while interrupts and preemption was + * enabled above. + */ + local_irq_disable_exit_to_user(); + ti_work = READ_ONCE(current_thread_info()->flags); + } + /* + * Was checked in exit_to_usermode_work() already, but the above + * loop might have wreckaged it. + */ + addr_limit_user_check(); + return ti_work; +} + +static void do_exit_to_usermode(struct pt_regs *regs) +{ + unsigned long ti_work = READ_ONCE(current_thread_info()->flags); + + lockdep_sys_exit(); + + addr_limit_user_check(); + + if (unlikely(ti_work & EXIT_TO_USERMODE_WORK)) + ti_work = core_exit_to_usermode_work(regs, ti_work); + + arch_exit_to_usermode(regs, ti_work); + /* Return to userspace right after this which turns on interrupts */ + trace_hardirqs_on(); +} + +/** + * exit_to_usermode - Check and handle pending work which needs to be + * handled before returning to user mode + * @regs: Pointer to currents pt_regs + * + * Called and returns with interrupts disabled + */ +asmlinkage __visible void exit_to_usermode(struct pt_regs *regs) +{ + trace_hardirqs_off(); + lockdep_assert_irqs_disabled(); + do_exit_to_usermode(regs); +} + long core_syscall_enter_from_usermode(struct pt_regs *regs, long syscall) { unsigned long ti_work = READ_ONCE(current_thread_info()->flags); @@ -85,4 +165,12 @@ void syscall_exit_to_usermode(struct pt_ ti_work = READ_ONCE(current_thread_info()->flags); if (unlikely(ti_work & SYSCALL_EXIT_WORK)) syscall_exit_work(regs, retval, ti_work); + +#ifdef ARCH_EXIT_TO_USER_FROM_SYSCALL_EXIT + /* + * Disable interrupts and handle the regular exit to user mode work + */ + local_irq_disable_exit_to_user(); + do_exit_to_usermode(regs); +#endif } From patchwork Thu Sep 19 15:03:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152783 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 24CA114ED for ; Thu, 19 Sep 2019 15:10:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 018CB21924 for ; Thu, 19 Sep 2019 15:10:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391004AbfISPK2 (ORCPT ); Thu, 19 Sep 2019 11:10:28 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50087 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2403922AbfISPJz (ORCPT ); Thu, 19 Sep 2019 11:09:55 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy45-0006oz-4o; Thu, 19 Sep 2019 17:09:49 +0200 Message-Id: <20190919150809.446771597@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:24 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 10/15] x86/entry: Move irq tracing to C code References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To prepare for converting the exit to usermode code to the generic version, move the irqflags tracing into C code. Signed-off-by: Thomas Gleixner --- arch/x86/entry/common.c | 10 ++++++++++ arch/x86/entry/entry_32.S | 11 +---------- arch/x86/entry/entry_64.S | 10 ++-------- arch/x86/entry/entry_64_compat.S | 21 --------------------- 4 files changed, 13 insertions(+), 39 deletions(-) --- a/arch/x86/entry/common.c +++ b/arch/x86/entry/common.c @@ -102,6 +102,8 @@ static void exit_to_usermode_loop(struct struct thread_info *ti = current_thread_info(); u32 cached_flags; + trace_hardirqs_off(); + addr_limit_user_check(); lockdep_assert_irqs_disabled(); @@ -137,6 +139,8 @@ static void exit_to_usermode_loop(struct user_enter_irqoff(); mds_user_clear_cpu_buffers(); + + trace_hardirqs_on(); } /* @@ -154,6 +158,8 @@ static void exit_to_usermode_loop(struct #ifdef CONFIG_X86_64 __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs) { + /* User to kernel transition disabled interrupts. */ + trace_hardirqs_off(); enter_from_user_mode(); local_irq_enable(); @@ -221,6 +227,7 @@ static __always_inline void do_syscall_3 /* Handles int $0x80 */ __visible void do_int80_syscall_32(struct pt_regs *regs) { + trace_hardirqs_off(); enter_from_user_mode(); local_irq_enable(); do_syscall_32_irqs_on(regs); @@ -237,6 +244,9 @@ static __always_inline void do_syscall_3 unsigned long landing_pad = (unsigned long)current->mm->context.vdso + vdso_image_32.sym_int80_landing_pad; + /* User to kernel transition disabled interrupts. */ + trace_hardirqs_off(); + /* * SYSENTER loses EIP, and even SYSCALL32 needs us to skip forward * so that 'regs->ip -= 2' lands back on an int $0x80 instruction. --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -827,7 +827,6 @@ END(ret_from_fork) ENTRY(resume_userspace) DISABLE_INTERRUPTS(CLBR_ANY) - TRACE_IRQS_OFF movl %esp, %eax call prepare_exit_to_usermode jmp restore_all @@ -1049,12 +1048,6 @@ ENTRY(entry_INT80_32) SAVE_ALL pt_regs_ax=$-ENOSYS switch_stacks=1 /* save rest */ - /* - * User mode is traced as though IRQs are on, and the interrupt gate - * turned them off. - */ - TRACE_IRQS_OFF - movl %esp, %eax call do_int80_syscall_32 .Lsyscall_32_done: @@ -1062,11 +1055,8 @@ ENTRY(entry_INT80_32) STACKLEAK_ERASE restore_all: - TRACE_IRQS_IRET SWITCH_TO_ENTRY_STACK -.Lrestore_all_notrace: CHECK_AND_APPLY_ESPFIX -.Lrestore_nocheck: /* Switch back to user CR3 */ SWITCH_TO_USER_CR3 scratch_reg=%eax @@ -1086,6 +1076,7 @@ ENTRY(entry_INT80_32) restore_all_kernel: #ifdef CONFIG_PREEMPTION DISABLE_INTERRUPTS(CLBR_ANY) + TRACE_IRQS_OFF cmpl $0, PER_CPU_VAR(__preempt_count) jnz .Lno_preempt testl $X86_EFLAGS_IF, PT_EFLAGS(%esp) # interrupts off (exception path) ? --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -167,15 +167,11 @@ GLOBAL(entry_SYSCALL_64_after_hwframe) PUSH_AND_CLEAR_REGS rax=$-ENOSYS - TRACE_IRQS_OFF - /* IRQs are off. */ movq %rax, %rdi movq %rsp, %rsi call do_syscall_64 /* returns with IRQs disabled */ - TRACE_IRQS_IRETQ /* we're about to change IF */ - /* * Try to use SYSRET instead of IRET if we're returning to * a completely clean 64-bit userspace context. If we're not, @@ -342,7 +338,6 @@ ENTRY(ret_from_fork) UNWIND_HINT_REGS movq %rsp, %rdi call syscall_return_slowpath /* returns with IRQs disabled */ - TRACE_IRQS_ON /* user mode is traced as IRQS on */ jmp swapgs_restore_regs_and_return_to_usermode 1: @@ -608,7 +603,6 @@ END(common_spurious) /* 0(%rsp): old RSP */ ret_from_intr: DISABLE_INTERRUPTS(CLBR_ANY) - TRACE_IRQS_OFF LEAVE_IRQ_STACK @@ -619,7 +613,6 @@ END(common_spurious) GLOBAL(retint_user) mov %rsp,%rdi call prepare_exit_to_usermode - TRACE_IRQS_IRETQ GLOBAL(swapgs_restore_regs_and_return_to_usermode) #ifdef CONFIG_DEBUG_ENTRY @@ -666,6 +659,7 @@ GLOBAL(swapgs_restore_regs_and_return_to retint_kernel: #ifdef CONFIG_PREEMPTION /* Interrupts are off */ + TRACE_IRQS_OFF /* Check if we need preemption */ btl $9, EFLAGS(%rsp) /* were interrupts off? */ jnc 1f @@ -1367,9 +1361,9 @@ ENTRY(error_entry) END(error_entry) ENTRY(error_exit) - UNWIND_HINT_REGS DISABLE_INTERRUPTS(CLBR_ANY) TRACE_IRQS_OFF + UNWIND_HINT_REGS testb $3, CS(%rsp) jz retint_kernel jmp retint_user --- a/arch/x86/entry/entry_64_compat.S +++ b/arch/x86/entry/entry_64_compat.S @@ -129,12 +129,6 @@ ENTRY(entry_SYSENTER_compat) jnz .Lsysenter_fix_flags .Lsysenter_flags_fixed: - /* - * User mode is traced as though IRQs are on, and SYSENTER - * turned them off. - */ - TRACE_IRQS_OFF - movq %rsp, %rdi call do_fast_syscall_32 /* XEN PV guests always use IRET path */ @@ -247,12 +241,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram pushq $0 /* pt_regs->r15 = 0 */ xorl %r15d, %r15d /* nospec r15 */ - /* - * User mode is traced as though IRQs are on, and SYSENTER - * turned them off. - */ - TRACE_IRQS_OFF - movq %rsp, %rdi call do_fast_syscall_32 /* XEN PV guests always use IRET path */ @@ -266,7 +254,6 @@ GLOBAL(entry_SYSCALL_compat_after_hwfram * stack. So let's erase the thread stack right now. */ STACKLEAK_ERASE - TRACE_IRQS_ON /* User mode traces as IRQs on. */ movq RBX(%rsp), %rbx /* pt_regs->rbx */ movq RBP(%rsp), %rbp /* pt_regs->rbp */ movq EFLAGS(%rsp), %r11 /* pt_regs->flags (in r11) */ @@ -403,17 +390,9 @@ ENTRY(entry_INT80_compat) xorl %r15d, %r15d /* nospec r15 */ cld - /* - * User mode is traced as though IRQs are on, and the interrupt - * gate turned them off. - */ - TRACE_IRQS_OFF - movq %rsp, %rdi call do_int80_syscall_32 .Lsyscall_32_done: - /* Go back to user mode. */ - TRACE_IRQS_ON jmp swapgs_restore_regs_and_return_to_usermode END(entry_INT80_compat) From patchwork Thu Sep 19 15:03:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152777 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EA3571599 for ; Thu, 19 Sep 2019 15:10:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C920421927 for ; Thu, 19 Sep 2019 15:10:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390974AbfISPKA (ORCPT ); Thu, 19 Sep 2019 11:10:00 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50100 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2403946AbfISPJ6 (ORCPT ); Thu, 19 Sep 2019 11:09:58 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy46-0006p7-3c; Thu, 19 Sep 2019 17:09:50 +0200 Message-Id: <20190919150809.553546825@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:25 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 11/15] x86/entry: Use generic exit to usermode References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Replace the x86 specific exit to usermode code with the generic implementation. Signed-off-by: Thomas Gleixner --- arch/x86/entry/common.c | 111 ------------------------------------ arch/x86/entry/entry_32.S | 2 arch/x86/entry/entry_64.S | 2 arch/x86/include/asm/entry-common.h | 47 ++++++++++++++- arch/x86/include/asm/signal.h | 1 arch/x86/kernel/signal.c | 2 6 files changed, 51 insertions(+), 114 deletions(-) --- a/arch/x86/entry/common.c +++ b/arch/x86/entry/common.c @@ -15,15 +15,9 @@ #include #include #include -#include -#include #include #include -#include -#include #include -#include -#include #include #include @@ -47,102 +41,6 @@ static inline void enter_from_user_mode(void) {} #endif -#define EXIT_TO_USERMODE_LOOP_FLAGS \ - (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ - _TIF_NEED_RESCHED | _TIF_USER_RETURN_NOTIFY | _TIF_PATCH_PENDING) - -static void exit_to_usermode_loop(struct pt_regs *regs, u32 cached_flags) -{ - /* - * In order to return to user mode, we need to have IRQs off with - * none of EXIT_TO_USERMODE_LOOP_FLAGS set. Several of these flags - * can be set at any time on preemptible kernels if we have IRQs on, - * so we need to loop. Disabling preemption wouldn't help: doing the - * work to clear some of the flags can sleep. - */ - while (true) { - /* We have work to do. */ - local_irq_enable(); - - if (cached_flags & _TIF_NEED_RESCHED) - schedule(); - - if (cached_flags & _TIF_UPROBE) - uprobe_notify_resume(regs); - - if (cached_flags & _TIF_PATCH_PENDING) - klp_update_patch_state(current); - - /* deal with pending signal delivery */ - if (cached_flags & _TIF_SIGPENDING) - do_signal(regs); - - if (cached_flags & _TIF_NOTIFY_RESUME) { - clear_thread_flag(TIF_NOTIFY_RESUME); - tracehook_notify_resume(regs); - rseq_handle_notify_resume(NULL, regs); - } - - if (cached_flags & _TIF_USER_RETURN_NOTIFY) - fire_user_return_notifiers(); - - /* Disable IRQs and retry */ - local_irq_disable(); - - cached_flags = READ_ONCE(current_thread_info()->flags); - - if (!(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS)) - break; - } -} - -/* Called with IRQs disabled. */ -__visible inline void prepare_exit_to_usermode(struct pt_regs *regs) -{ - struct thread_info *ti = current_thread_info(); - u32 cached_flags; - - trace_hardirqs_off(); - - addr_limit_user_check(); - - lockdep_assert_irqs_disabled(); - lockdep_sys_exit(); - - cached_flags = READ_ONCE(ti->flags); - - if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS)) - exit_to_usermode_loop(regs, cached_flags); - - /* Reload ti->flags; we may have rescheduled above. */ - cached_flags = READ_ONCE(ti->flags); - - fpregs_assert_state_consistent(); - if (unlikely(cached_flags & _TIF_NEED_FPU_LOAD)) - switch_fpu_return(); - -#ifdef CONFIG_COMPAT - /* - * Compat syscalls set TS_COMPAT. Make sure we clear it before - * returning to user mode. We need to clear it *after* signal - * handling, because syscall restart has a fixup for compat - * syscalls. The fixup is exercised by the ptrace_syscall_32 - * selftest. - * - * We also need to clear TS_REGS_POKED_I386: the 32-bit tracer - * special case only applies after poking regs and before the - * very next return to user mode. - */ - ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED); -#endif - - user_enter_irqoff(); - - mds_user_clear_cpu_buffers(); - - trace_hardirqs_on(); -} - /* * Called with IRQs on and fully valid regs. Returns with IRQs off in a * state such that we can immediately switch to user mode. @@ -150,9 +48,6 @@ static void exit_to_usermode_loop(struct __visible inline void syscall_return_slowpath(struct pt_regs *regs) { syscall_exit_to_usermode(regs, regs->orig_ax, regs->ax); - - local_irq_disable(); - prepare_exit_to_usermode(regs); } #ifdef CONFIG_X86_64 @@ -177,7 +72,7 @@ static void exit_to_usermode_loop(struct #endif } - syscall_return_slowpath(regs); + syscall_exit_to_usermode(regs, regs->orig_ax, regs->ax); } #endif @@ -221,7 +116,7 @@ static __always_inline void do_syscall_3 #endif /* CONFIG_IA32_EMULATION */ } - syscall_return_slowpath(regs); + syscall_exit_to_usermode(regs, regs->orig_ax, regs->ax); } /* Handles int $0x80 */ @@ -276,7 +171,7 @@ static __always_inline void do_syscall_3 /* User code screwed up. */ local_irq_disable(); regs->ax = -EFAULT; - prepare_exit_to_usermode(regs); + exit_to_usermode(regs); return 0; /* Keep it simple: use IRET. */ } --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -828,7 +828,7 @@ END(ret_from_fork) ENTRY(resume_userspace) DISABLE_INTERRUPTS(CLBR_ANY) movl %esp, %eax - call prepare_exit_to_usermode + call exit_to_usermode jmp restore_all END(ret_from_exception) --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -612,7 +612,7 @@ END(common_spurious) /* Interrupt came from user space */ GLOBAL(retint_user) mov %rsp,%rdi - call prepare_exit_to_usermode + call exit_to_usermode GLOBAL(swapgs_restore_regs_and_return_to_usermode) #ifdef CONFIG_DEBUG_ENTRY --- a/arch/x86/include/asm/entry-common.h +++ b/arch/x86/include/asm/entry-common.h @@ -2,11 +2,54 @@ #ifndef _ASM_X86_ENTRY_COMMON_H #define _ASM_X86_ENTRY_COMMON_H -#include -#include +#include +#include + +#include +#include #define ARCH_SYSCALL_EXIT_WORK (_TIF_SINGLESTEP) +#define ARCH_EXIT_TO_USERMODE_WORK (_TIF_USER_RETURN_NOTIFY) + +#define ARCH_EXIT_TO_USER_FROM_SYSCALL_EXIT + +static inline void arch_exit_to_usermode_work(struct pt_regs *regs, + unsigned long ti_work) +{ + if (ti_work & _TIF_USER_RETURN_NOTIFY) + fire_user_return_notifiers(); +} +#define arch_exit_to_usermode_work arch_exit_to_usermode_work + +static inline void arch_exit_to_usermode(struct pt_regs *regs, + unsigned long ti_work) +{ + fpregs_assert_state_consistent(); + if (unlikely(ti_work & _TIF_NEED_FPU_LOAD)) + switch_fpu_return(); + +#ifdef CONFIG_COMPAT + /* + * Compat syscalls set TS_COMPAT. Make sure we clear it before + * returning to user mode. We need to clear it *after* signal + * handling, because syscall restart has a fixup for compat + * syscalls. The fixup is exercised by the ptrace_syscall_32 + * selftest. + * + * We also need to clear TS_REGS_POKED_I386: the 32-bit tracer + * special case only applies after poking regs and before the + * very next return to user mode. + */ + current_thread_info()->status &= ~(TS_COMPAT | TS_I386_REGS_POKED); +#endif + + user_enter_irqoff(); + + mds_user_clear_cpu_buffers(); +} +#define arch_exit_to_usermode arch_exit_to_usermode + static inline long arch_syscall_enter_seccomp(struct pt_regs *regs) { #ifdef CONFIG_SECCOMP --- a/arch/x86/include/asm/signal.h +++ b/arch/x86/include/asm/signal.h @@ -35,7 +35,6 @@ typedef sigset_t compat_sigset_t; #endif /* __ASSEMBLY__ */ #include #ifndef __ASSEMBLY__ -extern void do_signal(struct pt_regs *regs); #define __ARCH_HAS_SA_RESTORER --- a/arch/x86/kernel/signal.c +++ b/arch/x86/kernel/signal.c @@ -808,7 +808,7 @@ static inline unsigned long get_nr_resta * want to handle. Thus you cannot kill init even with a SIGKILL even by * mistake. */ -void do_signal(struct pt_regs *regs) +void arch_do_signal(struct pt_regs *regs) { struct ksignal ksig; From patchwork Thu Sep 19 15:03:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152781 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 881B614ED for ; Thu, 19 Sep 2019 15:10:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7039D21924 for ; Thu, 19 Sep 2019 15:10:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390968AbfISPJ7 (ORCPT ); Thu, 19 Sep 2019 11:09:59 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50101 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2403949AbfISPJ6 (ORCPT ); Thu, 19 Sep 2019 11:09:58 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy47-0006pI-4k; Thu, 19 Sep 2019 17:09:51 +0200 Message-Id: <20190919150809.652455369@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:26 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 12/15] arm64/entry: Use generic exit to usermode References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Replace the exit to usermode code with the generic version. Signed-off-by: Thomas Gleixner --- arch/arm64/include/asm/entry-common.h | 29 +++++++++++++++++++++ arch/arm64/kernel/entry.S | 18 ++----------- arch/arm64/kernel/signal.c | 45 ---------------------------------- 3 files changed, 33 insertions(+), 59 deletions(-) --- a/arch/arm64/include/asm/entry-common.h +++ b/arch/arm64/include/asm/entry-common.h @@ -5,6 +5,35 @@ #ifndef __ASM_ENTRY_COMMON_H #define __ASM_ENTRY_COMMON_H +#include + +#define ARCH_EXIT_TO_USERMODE_WORK (_TIF_FOREIGN_FPSTATE) + +static inline void local_irq_enable_exit_to_user(unsigned long ti_work) +{ + if (ti_work & _TIF_NEED_RESCHED) + local_daif_restore(DAIF_PROCCTX_NOIRQ); + else + local_daif_restore(DAIF_PROCCTX); +} +#define local_irq_enable_exit_to_user local_irq_enable_exit_to_user + +static inline void local_irq_disable_exit_to_user(void) +{ + local_daif_mask(); +} +#define local_irq_disable_exit_to_user local_irq_disable_exit_to_user + +static inline void arch_exit_to_usermode_work(struct pt_regs *regs, + unsigned long ti_work) +{ + /* Must this be inside the work loop ? */ + if (ti_work & _TIF_FOREIGN_FPSTATE) + fpsimd_restore_current_state(); + +} +#define arch_exit_to_usermode_work arch_exit_to_usermode_work + enum ptrace_syscall_dir { PTRACE_SYSCALL_ENTER = 0, PTRACE_SYSCALL_EXIT, --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -971,26 +971,14 @@ ENDPROC(el1_error) ENDPROC(el0_error) /* - * Ok, we need to do extra processing, enter the slow path. - */ -work_pending: - mov x0, sp // 'regs' - bl do_notify_resume -#ifdef CONFIG_TRACE_IRQFLAGS - bl trace_hardirqs_on // enabled while in userspace -#endif - ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for single-step - b finish_ret_to_user -/* * "slow" syscall return path. */ ret_to_user: disable_daif gic_prio_kentry_setup tmp=x3 - ldr x1, [tsk, #TSK_TI_FLAGS] - and x2, x1, #_TIF_WORK_MASK - cbnz x2, work_pending -finish_ret_to_user: + mov x0, sp // 'regs' + bl exit_to_usermode + ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for single-step enable_step_tsk x1, x2 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK bl stackleak_erase --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c @@ -825,7 +825,7 @@ static void handle_signal(struct ksignal * the kernel can handle, and then we build all the user-level signal handling * stack-frames in one go after that. */ -static void do_signal(struct pt_regs *regs) +void arch_do_signal(struct pt_regs *regs) { unsigned long continue_addr = 0, restart_addr = 0; int retval = 0; @@ -896,49 +896,6 @@ static void do_signal(struct pt_regs *re restore_saved_sigmask(); } -asmlinkage void do_notify_resume(struct pt_regs *regs, - unsigned long thread_flags) -{ - /* - * The assembly code enters us with IRQs off, but it hasn't - * informed the tracing code of that for efficiency reasons. - * Update the trace code with the current status. - */ - trace_hardirqs_off(); - - do { - /* Check valid user FS if needed */ - addr_limit_user_check(); - - if (thread_flags & _TIF_NEED_RESCHED) { - /* Unmask Debug and SError for the next task */ - local_daif_restore(DAIF_PROCCTX_NOIRQ); - - schedule(); - } else { - local_daif_restore(DAIF_PROCCTX); - - if (thread_flags & _TIF_UPROBE) - uprobe_notify_resume(regs); - - if (thread_flags & _TIF_SIGPENDING) - do_signal(regs); - - if (thread_flags & _TIF_NOTIFY_RESUME) { - clear_thread_flag(TIF_NOTIFY_RESUME); - tracehook_notify_resume(regs); - rseq_handle_notify_resume(NULL, regs); - } - - if (thread_flags & _TIF_FOREIGN_FPSTATE) - fpsimd_restore_current_state(); - } - - local_daif_mask(); - thread_flags = READ_ONCE(current_thread_info()->flags); - } while (thread_flags & _TIF_WORK_MASK); -} - unsigned long __ro_after_init signal_minsigstksz; /* From patchwork Thu Sep 19 15:03:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152775 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 64B551599 for ; Thu, 19 Sep 2019 15:10:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 42A8E21927 for ; Thu, 19 Sep 2019 15:10:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390981AbfISPKA (ORCPT ); Thu, 19 Sep 2019 11:10:00 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50103 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2403953AbfISPJ6 (ORCPT ); Thu, 19 Sep 2019 11:09:58 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy47-0006pT-V7; Thu, 19 Sep 2019 17:09:52 +0200 Message-Id: <20190919150809.754111224@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:27 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 13/15] arm64/entry: Move FPU restore out of exit_to_usermode() loop References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Restoring the FPU is not required to be insided the loop. The final point to do that is before returning to user space. Move it there. Signed-off-by: Thomas Gleixner --- arch/arm64/include/asm/entry-common.h | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) --- a/arch/arm64/include/asm/entry-common.h +++ b/arch/arm64/include/asm/entry-common.h @@ -7,8 +7,6 @@ #include -#define ARCH_EXIT_TO_USERMODE_WORK (_TIF_FOREIGN_FPSTATE) - static inline void local_irq_enable_exit_to_user(unsigned long ti_work) { if (ti_work & _TIF_NEED_RESCHED) @@ -24,15 +22,14 @@ static inline void local_irq_disable_exi } #define local_irq_disable_exit_to_user local_irq_disable_exit_to_user -static inline void arch_exit_to_usermode_work(struct pt_regs *regs, - unsigned long ti_work) +static inline void arch_exit_to_usermode(struct pt_regs *regs, + unsigned long ti_work) { - /* Must this be inside the work loop ? */ if (ti_work & _TIF_FOREIGN_FPSTATE) fpsimd_restore_current_state(); } -#define arch_exit_to_usermode_work arch_exit_to_usermode_work +#define arch_exit_to_usermode arch_exit_to_usermode enum ptrace_syscall_dir { PTRACE_SYSCALL_ENTER = 0, From patchwork Thu Sep 19 15:03:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152773 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 44D511599 for ; Thu, 19 Sep 2019 15:10:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2DCF721924 for ; Thu, 19 Sep 2019 15:10:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403960AbfISPKC (ORCPT ); Thu, 19 Sep 2019 11:10:02 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50105 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388199AbfISPKA (ORCPT ); Thu, 19 Sep 2019 11:10:00 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy49-0006pg-1u; Thu, 19 Sep 2019 17:09:53 +0200 Message-Id: <20190919150809.860645841@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:28 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 14/15] workpending: Provide infrastructure for work before entering a guest References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Entering a guest is similar to exiting to user space. Pending work like handling signals, rescheduling, task work etc. needs to be handled before that. Provide generic infrastructure to avoid duplication of the same handling code all over the place. Update ARM64 struct kvm_vcpu_stat with a signal_exit member so the generic code compiles. Signed-off-by: Thomas Gleixner --- arch/arm64/include/asm/kvm_host.h | 1 include/linux/entry-common.h | 66 ++++++++++++++++++++++++++++++++++++++ kernel/entry/common.c | 44 +++++++++++++++++++++++++ 3 files changed, 111 insertions(+) --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -409,6 +409,7 @@ struct kvm_vcpu_stat { u64 wfi_exit_stat; u64 mmio_exit_user; u64 mmio_exit_kernel; + u64 signal_exits; u64 exits; }; --- a/include/linux/entry-common.h +++ b/include/linux/entry-common.h @@ -255,4 +255,70 @@ static inline void arch_syscall_exit_tra /* Common syscall exit function */ void syscall_exit_to_usermode(struct pt_regs *regs, long syscall, long retval); +#if IS_ENABLED(CONFIG_KVM) + +#include + +#ifndef ARCH_EXIT_TO_GUESTMODE_WORK +# define ARCH_EXIT_TO_GUESTMODE_WORK (0) +#endif + +#define EXIT_TO_GUESTMODE_WORK \ + (_TIF_NEED_RESCHED | _TIF_SIGPENDING | _TIF_NOTIFY_RESUME | \ + ARCH_EXIT_TO_GUESTMODE_WORK) + +int core_exit_to_guestmode_work(struct kvm *kvm, struct kvm_vcpu *vcpu, + unsigned long ti_work); + +/** + * arch_exit_to_guestmode - Architecture specific exit to guest mode function + * @kvm: Pointer to the guest instance + * @vcpu: Pointer to current's VCPU data + * @ti_work: Cached TIF flags gathered in exit_to_guestmode() + * + * Invoked from core_exit_to_guestmode_work(). Can be replaced by + * architecture specific code. + */ +static inline int arch_exit_to_guestmode(struct kvm *kvm, struct kvm_vcpu *vcpu, + unsigned long ti_work); + +#ifndef arch_exit_to_guestmode +static inline int arch_exit_to_guestmode(struct kvm *kvm, struct kvm_vcpu *vcpu, + unsigned long ti_work) +{ + return 0; +} +#endif + +/** + * exit_to_guestmode - Check and handle pending work which needs to be + * handled before returning to guest mode + * @kvm: Pointer to the guest instance + * @vcpu: Pointer to current's VCPU data + * + * Returns: 0 or an error code + */ +static inline int exit_to_guestmode(struct kvm *kvm, struct kvm_vcpu *vcpu) +{ + unsigned long ti_work = READ_ONCE(current_thread_info()->flags); + + if (unlikely(ti_work & EXIT_TO_GUESTMODE_WORK)) + return core_exit_to_guestmode_work(kvm, vcpu, ti_work); + return 0; +} + + +/** + * _exit_to_guestmode_work_pending - Check if work is pending which needs to be + * handled before returning to guest mode + */ +static inline bool exit_to_guestmode_work_pending(void) +{ + unsigned long ti_work = READ_ONCE(current_thread_info()->flags); + + return !!(ti_work & EXIT_TO_GUESTMODE_WORK); + +} +#endif /* CONFIG_KVM */ + #endif --- a/kernel/entry/common.c +++ b/kernel/entry/common.c @@ -174,3 +174,47 @@ void syscall_exit_to_usermode(struct pt_ do_exit_to_usermode(regs); #endif } + +#if IS_ENABLED(CONFIG_KVM) +int __weak arch_exit_to_guestmode_work(struct kvm *kvm, struct kvm_vcpu *vcpu, + unsigned long ti_work) +{ + return 0; +} + +int core_exit_to_guestmode_work(struct kvm *kvm, struct kvm_vcpu *vcpu, + unsigned long ti_work) +{ + /* + * Before returning to guest mode handle all pending work + */ + if (ti_work & _TIF_SIGPENDING) { + vcpu->run->exit_reason = KVM_EXIT_INTR; + vcpu->stat.signal_exits++; + return -EINTR; + } + + if (ti_work & _TIF_NEED_RESCHED) { + srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); + schedule(); + vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); + } + + if (ti_work & _TIF_PATCH_PENDING) { + srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); + klp_update_patch_state(current); + vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); + } + + if (ti_work & _TIF_NOTIFY_RESUME) { + srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); + clear_thread_flag(TIF_NOTIFY_RESUME); + tracehook_notify_resume(NULL); + vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); + } + + /* Any extra architecture specific work */ + return arch_exit_to_guestmode_work(kvm, vcpu, ti_work); +} +EXPORT_SYMBOL_GPL(core_exit_to_guestmode_work); +#endif From patchwork Thu Sep 19 15:03:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 11152779 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2FD8D14ED for ; Thu, 19 Sep 2019 15:10:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0E80121929 for ; Thu, 19 Sep 2019 15:10:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404016AbfISPKV (ORCPT ); Thu, 19 Sep 2019 11:10:21 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:50106 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390959AbfISPJ7 (ORCPT ); Thu, 19 Sep 2019 11:09:59 -0400 Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1iAy4A-0006pw-01; Thu, 19 Sep 2019 17:09:54 +0200 Message-Id: <20190919150809.964620570@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 19 Sep 2019 17:03:29 +0200 From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Peter Zijlstra , Andy Lutomirski , Catalin Marinas , Will Deacon , Mark Rutland , Marc Zyngier , Paolo Bonzini , kvm@vger.kernel.org, linux-arch@vger.kernel.org Subject: [RFC patch 15/15] x86/kvm: Use GENERIC_EXIT_WORKPENDING References: <20190919150314.054351477@linutronix.de> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use the generic infrastructure to check for and handle pending work before entering into guest mode. Signed-off-by: Thomas Gleixner --- arch/x86/kvm/x86.c | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-) --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -52,6 +52,7 @@ #include #include #include +#include #include #include @@ -7984,8 +7985,8 @@ static int vcpu_enter_guest(struct kvm_v if (kvm_lapic_enabled(vcpu) && vcpu->arch.apicv_active) kvm_x86_ops->sync_pir_to_irr(vcpu); - if (vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) - || need_resched() || signal_pending(current)) { + if (vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) || + exit_to_guestmode_work_pending()) { vcpu->mode = OUTSIDE_GUEST_MODE; smp_wmb(); local_irq_enable(); @@ -8178,17 +8179,9 @@ static int vcpu_run(struct kvm_vcpu *vcp kvm_check_async_pf_completion(vcpu); - if (signal_pending(current)) { - r = -EINTR; - vcpu->run->exit_reason = KVM_EXIT_INTR; - ++vcpu->stat.signal_exits; + r = exit_to_guestmode(kvm, vcpu); + if (r) break; - } - if (need_resched()) { - srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); - cond_resched(); - vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); - } } srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);