From patchwork Mon Apr 14 11:11:41 2025
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 14050242
Message-ID: <20250414113753.951654151@infradead.org>
Date: Mon, 14 Apr 2025 13:11:41 +0200
From: Peter Zijlstra
To: x86@kernel.org
Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
    hpa@zytor.com, peterz@infradead.org, jpoimboe@kernel.org,
    pawan.kumar.gupta@linux.intel.com, seanjc@google.com, pbonzini@redhat.com,
    ardb@kernel.org, kees@kernel.org, Arnd Bergmann, gregkh@linuxfoundation.org,
    linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    linux-efi@vger.kernel.org, samitolvanen@google.com, ojeda@kernel.org
Subject: [PATCH 1/6] x86/nospec: JMP_NOSPEC
References: <20250414111140.586315004@infradead.org>

Signed-off-by: Peter Zijlstra (Intel)
---
 arch/x86/include/asm/nospec-branch.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -438,6 +438,9 @@ static inline void call_depth_return_thu
 #define CALL_NOSPEC	__CS_PREFIX("%V[thunk_target]") \
			"call __x86_indirect_thunk_%V[thunk_target]\n"

+#define JMP_NOSPEC	__CS_PREFIX("%V[thunk_target]") \
+			"jmp __x86_indirect_thunk_%V[thunk_target]\n"
+
 # define THUNK_TARGET(addr) [thunk_target] "r" (addr)

 #else /* CONFIG_X86_32 */
@@ -468,10 +471,31 @@ static inline void call_depth_return_thu
			"call *%[thunk_target]\n", \
			X86_FEATURE_RETPOLINE_LFENCE)

+# define JMP_NOSPEC						\
+	ALTERNATIVE_2(						\
+	ANNOTATE_RETPOLINE_SAFE					\
+	"jmp *%[thunk_target]\n",				\
+	"       jmp    901f;\n"					\
+	"	.align 16\n"					\
+	"901:	call   903f;\n"					\
+	"902:	pause;\n"					\
+	"	lfence;\n"					\
+	"	jmp    902b;\n"					\
+	"	.align 16\n"					\
+	"903:	lea    4(%%esp), %%esp;\n"			\
+	"	pushl  %[thunk_target];\n"			\
+	"	ret;\n",					\
+	X86_FEATURE_RETPOLINE,					\
+	"lfence;\n"						\
+	ANNOTATE_RETPOLINE_SAFE					\
+	"jmp *%[thunk_target]\n",				\
+	X86_FEATURE_RETPOLINE_LFENCE)
+
 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
 #endif

 #else /* No retpoline for C / inline asm */

 # define CALL_NOSPEC "call *%[thunk_target]\n"
+# define JMP_NOSPEC "jmp *%[thunk_target]\n"
 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)

 #endif

From patchwork Mon Apr 14 11:11:42 2025
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 14050247
Message-ID: <20250414113754.062619856@infradead.org>
Date: Mon, 14 Apr 2025 13:11:42 +0200
From: Peter Zijlstra
To: x86@kernel.org
Subject: [PATCH 2/6] x86/kvm/emulate: Implement test_cc() in C
References: <20250414111140.586315004@infradead.org>

The current test_cc() uses the fastop infrastructure to test flags using
SETcc instructions. However, int3_emulate_jcc() already fully implements
the flags->CC mapping; use that instead. This removes a pile of gnarly asm.
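For reference, here is a minimal user-space sketch of the flags->CC mapping that
int3_emulate_jcc() / __emulate_cc() implement. The helper name, the F_* macros and
the reduced table below are illustrative only; the kernel version operates on the
X86_EFLAGS_* constants shown in the diff.

/*
 * Sketch only: the low nibble of a Jcc/SETcc opcode selects the condition,
 * bit 0 inverts it.  Flag bit positions follow the architectural EFLAGS
 * layout (CF=0, PF=2, ZF=6, SF=7, OF=11).
 */
#include <stdbool.h>
#include <stdio.h>

#define F_CF (1ul << 0)
#define F_PF (1ul << 2)
#define F_ZF (1ul << 6)
#define F_SF (1ul << 7)
#define F_OF (1ul << 11)

static bool emulate_cc(unsigned long flags, unsigned char cc)
{
	/* conditions 0x0..0xb test a single mask, selected by cc >> 1 */
	static const unsigned long mask[6] = {
		F_OF, F_CF, F_ZF, F_CF | F_ZF, F_SF, F_PF,
	};
	bool invert = cc & 1;
	bool match;

	if (cc < 0xc) {
		match = flags & mask[cc >> 1];
	} else {
		/* 0xc/0xd: SF != OF (less); 0xe/0xf: additionally ZF (less-or-equal) */
		match = !!(flags & F_SF) ^ !!(flags & F_OF);
		if (cc >= 0xe)
			match = match || (flags & F_ZF);
	}

	return match ^ invert;
}

int main(void)
{
	/* cc 0x4 is "E/Z": taken when ZF is set */
	printf("JE  with ZF set   : %d\n", emulate_cc(F_ZF, 0x4));
	/* cc 0xd is "GE/NL": taken when SF == OF */
	printf("JGE with SF=1 OF=0: %d\n", emulate_cc(F_SF, 0xd));
	return 0;
}

The condition nibble is exactly what the new test_cc() passes down as
'condition & 0xf'.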
Signed-off-by: Peter Zijlstra (Intel)
---
 arch/x86/include/asm/text-patching.h | 20 +++++++++++++-------
 arch/x86/kvm/emulate.c               | 34 ++--------------------------------
 2 files changed, 15 insertions(+), 39 deletions(-)

--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -177,9 +177,9 @@ void int3_emulate_ret(struct pt_regs *re
 }

 static __always_inline
-void int3_emulate_jcc(struct pt_regs *regs, u8 cc, unsigned long ip, unsigned long disp)
+bool __emulate_cc(unsigned long flags, u8 cc)
 {
-	static const unsigned long jcc_mask[6] = {
+	static const unsigned long cc_mask[6] = {
		[0] = X86_EFLAGS_OF,
		[1] = X86_EFLAGS_CF,
		[2] = X86_EFLAGS_ZF,
@@ -192,15 +192,21 @@ void int3_emulate_jcc(struct pt_regs *re
	bool match;

	if (cc < 0xc) {
-		match = regs->flags & jcc_mask[cc >> 1];
+		match = flags & cc_mask[cc >> 1];
	} else {
-		match = ((regs->flags & X86_EFLAGS_SF) >> X86_EFLAGS_SF_BIT) ^
-			((regs->flags & X86_EFLAGS_OF) >> X86_EFLAGS_OF_BIT);
+		match = ((flags & X86_EFLAGS_SF) >> X86_EFLAGS_SF_BIT) ^
+			((flags & X86_EFLAGS_OF) >> X86_EFLAGS_OF_BIT);
		if (cc >= 0xe)
-			match = match || (regs->flags & X86_EFLAGS_ZF);
+			match = match || (flags & X86_EFLAGS_ZF);
	}

-	if ((match && !invert) || (!match && invert))
+	return (match && !invert) || (!match && invert);
+}
+
+static __always_inline
+void int3_emulate_jcc(struct pt_regs *regs, u8 cc, unsigned long ip, unsigned long disp)
+{
+	if (__emulate_cc(regs->flags, cc))
		ip += disp;

	int3_emulate_jmp(regs, ip);
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include

 #include "x86.h"
 #include "tss.h"
@@ -416,31 +417,6 @@ static int fastop(struct x86_emulate_ctx
	ON64(FOP3E(op##q, rax, rdx, cl)) \
	FOP_END

-/* Special case for SETcc - 1 instruction per cc */
-#define FOP_SETCC(op) \
-	FOP_FUNC(op) \
-	#op " %al \n\t" \
-	FOP_RET(op)
-
-FOP_START(setcc)
-FOP_SETCC(seto)
-FOP_SETCC(setno)
-FOP_SETCC(setc)
-FOP_SETCC(setnc)
-FOP_SETCC(setz)
-FOP_SETCC(setnz)
-FOP_SETCC(setbe)
-FOP_SETCC(setnbe)
-FOP_SETCC(sets)
-FOP_SETCC(setns)
-FOP_SETCC(setp)
-FOP_SETCC(setnp)
-FOP_SETCC(setl)
-FOP_SETCC(setnl)
-FOP_SETCC(setle)
-FOP_SETCC(setnle)
-FOP_END;
-
 FOP_START(salc)
 FOP_FUNC(salc)
 "pushf; sbb %al, %al; popf \n\t"
@@ -1068,13 +1044,7 @@ static int em_bsr_c(struct x86_emulate_c

 static __always_inline u8 test_cc(unsigned int condition, unsigned long flags)
 {
-	u8 rc;
-	void (*fop)(void) = (void *)em_setcc + FASTOP_SIZE * (condition & 0xf);
-
-	flags = (flags & EFLAGS_MASK) | X86_EFLAGS_IF;
-	asm("push %[flags]; popf; " CALL_NOSPEC
-	    : "=a"(rc), ASM_CALL_CONSTRAINT : [thunk_target]"r"(fop), [flags]"r"(flags));
-	return rc;
+	return __emulate_cc(flags, condition & 0xf);
 }

 static void fetch_register_operand(struct operand *op)

From patchwork Mon Apr 14 11:11:43 2025
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 14050243
Message-ID: <20250414113754.172767741@infradead.org>
Date: Mon, 14 Apr 2025 13:11:43 +0200
From: Peter Zijlstra
To: x86@kernel.org
Subject: [PATCH 3/6] x86/kvm/emulate: Avoid RET for fastops
References: <20250414111140.586315004@infradead.org>

Since there is only a single fastop() function, convert the FASTOP stuff
from CALL_NOSPEC+RET to JMP_NOSPEC+JMP, avoiding the return thunks and all
that jazz. Specifically, FASTOPs rely on the return thunk to preserve
EFLAGS, which not all of them can trivially do (call depth tracking
suffers here).
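To illustrate the EFLAGS dependency, a user-space sketch (assuming a GCC/Clang
x86-64 build; not kernel code): the caller reads the stub's result straight out
of EFLAGS with pushf/pop, so nothing executed on the way back from the stub may
modify the flags. The actual kernel sequence is in the fastop() hunk further down.

#include <stdio.h>

int main(void)
{
	unsigned long flags;

	/*
	 * Stand-in for a fastop body: the CMP leaves its result only in
	 * EFLAGS, and the caller captures it with pushf/pop right after.
	 * Any flag-modifying instruction between the two (e.g. return-thunk
	 * call-depth accounting) would corrupt the captured value.
	 */
	asm volatile("cmpq %[a], %[b]\n\t"
		     "pushfq\n\t"
		     "popq %[f]"
		     : [f] "=r" (flags)
		     : [a] "r" (2UL), [b] "r" (1UL)
		     : "cc");

	/* 1 - 2 borrows: CF (bit 0) set, ZF (bit 6) clear */
	printf("CF=%lu ZF=%lu\n", flags & 1, (flags >> 6) & 1);
	return 0;
}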
Objtool strenuously complains about things; fix up the various problems:

 - indirect call without a .rodata fails to determine JUMP_TABLE; add an
   annotation for this.

 - fastop functions fall through; create an exception for this case.

 - unreachable instruction after fastop_return; save/restore.

Signed-off-by: Peter Zijlstra (Intel)
---
 arch/x86/kvm/emulate.c              | 20 +++++++++++++++-----
 include/linux/objtool_types.h       |  1 +
 tools/include/linux/objtool_types.h |  1 +
 tools/objtool/check.c               | 11 ++++++++++-
 4 files changed, 27 insertions(+), 6 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -285,8 +285,8 @@ static void invalidate_registers(struct
  * different operand sizes can be reached by calculation, rather than a jump
  * table (which would be bigger than the code).
  *
- * The 16 byte alignment, considering 5 bytes for the RET thunk, 3 for ENDBR
- * and 1 for the straight line speculation INT3, leaves 7 bytes for the
+ * The 16 byte alignment, considering 5 bytes for the JMP, 4 for ENDBR
+ * and 1 for the straight line speculation INT3, leaves 6 bytes for the
  * body of the function. Currently none is larger than 4.
  */
 static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop);
@@ -304,7 +304,7 @@ static int fastop(struct x86_emulate_ctx
	__FOP_FUNC(#name)

 #define __FOP_RET(name) \
-	"11: " ASM_RET \
+	"11: jmp fastop_return; int3 \n\t" \
	".size " name ", .-" name "\n\t"

 #define FOP_RET(name) \
@@ -5044,14 +5044,24 @@ static void fetch_possible_mmx_operand(s
		kvm_read_mmx_reg(op->addr.mm, &op->mm_val);
 }

-static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop)
+/*
+ * All the FASTOP magic above relies on there being *one* instance of this
+ * so it can JMP back, avoiding RET and its various thunks.
+ */
+static noinline int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop)
 {
	ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;

	if (!(ctxt->d & ByteOp))
		fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;

-	asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
+	asm("push %[flags]; popf \n\t"
+	    UNWIND_HINT(UNWIND_HINT_TYPE_SAVE, 0, 0, 0)
+	    ASM_ANNOTATE(ANNOTYPE_JUMP_TABLE)
+	    JMP_NOSPEC
+	    "fastop_return: \n\t"
+	    UNWIND_HINT(UNWIND_HINT_TYPE_RESTORE, 0, 0, 0)
+	    "pushf; pop %[flags]\n"
	    : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
	      [thunk_target]"+S"(fop), ASM_CALL_CONSTRAINT
	    : "c"(ctxt->src2.val));
--- a/include/linux/objtool_types.h
+++ b/include/linux/objtool_types.h
@@ -65,5 +65,6 @@ struct unwind_hint {
 #define ANNOTYPE_IGNORE_ALTS		6
 #define ANNOTYPE_INTRA_FUNCTION_CALL	7
 #define ANNOTYPE_REACHABLE		8
+#define ANNOTYPE_JUMP_TABLE		9

 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/include/linux/objtool_types.h
+++ b/tools/include/linux/objtool_types.h
@@ -65,5 +65,6 @@ struct unwind_hint {
 #define ANNOTYPE_IGNORE_ALTS		6
 #define ANNOTYPE_INTRA_FUNCTION_CALL	7
 #define ANNOTYPE_REACHABLE		8
+#define ANNOTYPE_JUMP_TABLE		9

 #endif /* _LINUX_OBJTOOL_TYPES_H */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2428,6 +2428,14 @@ static int __annotate_late(struct objtoo
		insn->dead_end = false;
		break;

+	/*
+	 * Must be after add_jump_table(); for it doesn't set a sane
+	 * _jump_table value.
+	 */
+	case ANNOTYPE_JUMP_TABLE:
+		insn->_jump_table = (void *)1;
+		break;
+
	default:
		ERROR_INSN(insn, "Unknown annotation type: %d", type);
		return -1;
@@ -3559,7 +3567,8 @@ static int validate_branch(struct objtoo
		if (func && insn_func(insn) && func != insn_func(insn)->pfunc) {
			/* Ignore KCFI type preambles, which always fall through */
			if (!strncmp(func->name, "__cfi_", 6) ||
-			    !strncmp(func->name, "__pfx_", 6))
+			    !strncmp(func->name, "__pfx_", 6) ||
+			    !strcmp(insn_func(insn)->name, "fastop"))
				return 0;

			if (file->ignore_unreachables)

From patchwork Mon Apr 14 11:11:44 2025
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 14050245
Message-ID: <20250414113754.285564821@infradead.org>
Date: Mon, 14 Apr 2025 13:11:44 +0200
From:
Peter Zijlstra To: x86@kernel.org Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, jpoimboe@kernel.org, pawan.kumar.gupta@linux.intel.com, seanjc@google.com, pbonzini@redhat.com, ardb@kernel.org, kees@kernel.org, Arnd Bergmann , gregkh@linuxfoundation.org, linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, samitolvanen@google.com, ojeda@kernel.org Subject: [PATCH 4/6] x86,hyperv: Clean up hv_do_hypercall() References: <20250414111140.586315004@infradead.org> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 What used to be a simple few instructions has turned into a giant mess (for x86_64). Not only does it use static_branch wrong, it mixes it with dynamic branches for no apparent reason. Notably it uses static_branch through an out-of-line function call, which completely defeats the purpose, since instead of a simple JMP/NOP site, you get a CALL+RET+TEST+Jcc sequence in return, which is absolutely idiotic. Add to that a dynamic test of hyperv_paravisor_present, something which is set once and never changed. Replace all this idiocy with a single direct function call to the right hypercall variant. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/hyperv/hv_init.c | 21 ++++++ arch/x86/hyperv/ivm.c | 14 ++++ arch/x86/include/asm/mshyperv.h | 137 +++++++++++----------------------------- arch/x86/kernel/cpu/mshyperv.c | 18 +++-- 4 files changed, 88 insertions(+), 102 deletions(-) --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -35,7 +35,28 @@ #include void *hv_hypercall_pg; + +#ifdef CONFIG_X86_64 +u64 hv_pg_hypercall(u64 control, u64 param1, u64 param2) +{ + u64 hv_status; + + if (!hv_hypercall_pg) + return U64_MAX; + + register u64 __r8 asm("r8") = param2; + asm volatile (CALL_NOSPEC + : "=a" (hv_status), ASM_CALL_CONSTRAINT, + "+c" (control), "+d" (param1) + : "r" (__r8), + THUNK_TARGET(hv_hypercall_pg) + : "cc", "memory", "r9", "r10", "r11"); + + return hv_status; +} +#else EXPORT_SYMBOL_GPL(hv_hypercall_pg); +#endif union hv_ghcb * __percpu *hv_ghcb_pg; --- a/arch/x86/hyperv/ivm.c +++ b/arch/x86/hyperv/ivm.c @@ -376,6 +376,20 @@ int hv_snp_boot_ap(u32 cpu, unsigned lon return ret; } +u64 hv_snp_hypercall(u64 control, u64 param1, u64 param2) +{ + u64 hv_status; + + register u64 __r8 asm("r8") = param2; + asm volatile("vmmcall" + : "=a" (hv_status), ASM_CALL_CONSTRAINT, + "+c" (control), "+d" (param1) + : "r" (__r8) + : "cc", "memory", "r9", "r10", "r11"); + + return hv_status; +} + #else static inline void hv_ghcb_msr_write(u64 msr, u64 value) {} static inline void hv_ghcb_msr_read(u64 msr, u64 *value) {} --- a/arch/x86/include/asm/mshyperv.h +++ b/arch/x86/include/asm/mshyperv.h @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -39,15 +40,20 @@ static inline unsigned char hv_get_nmi_r } #if IS_ENABLED(CONFIG_HYPERV) -extern bool hyperv_paravisor_present; - extern void *hv_hypercall_pg; extern union hv_ghcb * __percpu *hv_ghcb_pg; bool hv_isolation_type_snp(void); bool hv_isolation_type_tdx(void); -u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2); + +#ifdef CONFIG_X86_64 +extern u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2); +extern u64 hv_snp_hypercall(u64 control, u64 param1, u64 param2); +extern u64 hv_pg_hypercall(u64 control, u64 param1, 
u64 param2); + +DECLARE_STATIC_CALL(hv_hypercall, hv_pg_hypercall); +#endif /* * DEFAULT INIT GPAT and SEGMENT LIMIT value in struct VMSA @@ -64,37 +70,15 @@ static inline u64 hv_do_hypercall(u64 co { u64 input_address = input ? virt_to_phys(input) : 0; u64 output_address = output ? virt_to_phys(output) : 0; - u64 hv_status; #ifdef CONFIG_X86_64 - if (hv_isolation_type_tdx() && !hyperv_paravisor_present) - return hv_tdx_hypercall(control, input_address, output_address); - - if (hv_isolation_type_snp() && !hyperv_paravisor_present) { - __asm__ __volatile__("mov %[output_address], %%r8\n" - "vmmcall" - : "=a" (hv_status), ASM_CALL_CONSTRAINT, - "+c" (control), "+d" (input_address) - : [output_address] "r" (output_address) - : "cc", "memory", "r8", "r9", "r10", "r11"); - return hv_status; - } - - if (!hv_hypercall_pg) - return U64_MAX; - - __asm__ __volatile__("mov %[output_address], %%r8\n" - CALL_NOSPEC - : "=a" (hv_status), ASM_CALL_CONSTRAINT, - "+c" (control), "+d" (input_address) - : [output_address] "r" (output_address), - THUNK_TARGET(hv_hypercall_pg) - : "cc", "memory", "r8", "r9", "r10", "r11"); + return static_call_mod(hv_hypercall)(control, input_address, output_address); #else u32 input_address_hi = upper_32_bits(input_address); u32 input_address_lo = lower_32_bits(input_address); u32 output_address_hi = upper_32_bits(output_address); u32 output_address_lo = lower_32_bits(output_address); + u64 hv_status; if (!hv_hypercall_pg) return U64_MAX; @@ -107,8 +91,8 @@ static inline u64 hv_do_hypercall(u64 co "D"(output_address_hi), "S"(output_address_lo), THUNK_TARGET(hv_hypercall_pg) : "cc", "memory"); -#endif /* !x86_64 */ return hv_status; +#endif /* !x86_64 */ } /* Hypercall to the L0 hypervisor */ @@ -120,41 +104,23 @@ static inline u64 hv_do_nested_hypercall /* Fast hypercall with 8 bytes of input and no output */ static inline u64 _hv_do_fast_hypercall8(u64 control, u64 input1) { - u64 hv_status; - #ifdef CONFIG_X86_64 - if (hv_isolation_type_tdx() && !hyperv_paravisor_present) - return hv_tdx_hypercall(control, input1, 0); - - if (hv_isolation_type_snp() && !hyperv_paravisor_present) { - __asm__ __volatile__( - "vmmcall" - : "=a" (hv_status), ASM_CALL_CONSTRAINT, - "+c" (control), "+d" (input1) - :: "cc", "r8", "r9", "r10", "r11"); - } else { - __asm__ __volatile__(CALL_NOSPEC - : "=a" (hv_status), ASM_CALL_CONSTRAINT, - "+c" (control), "+d" (input1) - : THUNK_TARGET(hv_hypercall_pg) - : "cc", "r8", "r9", "r10", "r11"); - } + return static_call_mod(hv_hypercall)(control, input1, 0); #else - { - u32 input1_hi = upper_32_bits(input1); - u32 input1_lo = lower_32_bits(input1); - - __asm__ __volatile__ (CALL_NOSPEC - : "=A"(hv_status), - "+c"(input1_lo), - ASM_CALL_CONSTRAINT - : "A" (control), - "b" (input1_hi), - THUNK_TARGET(hv_hypercall_pg) - : "cc", "edi", "esi"); - } -#endif + u32 input1_hi = upper_32_bits(input1); + u32 input1_lo = lower_32_bits(input1); + u64 hv_status; + + __asm__ __volatile__ (CALL_NOSPEC + : "=A"(hv_status), + "+c"(input1_lo), + ASM_CALL_CONSTRAINT + : "A" (control), + "b" (input1_hi), + THUNK_TARGET(hv_hypercall_pg) + : "cc", "edi", "esi"); return hv_status; +#endif } static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1) @@ -174,45 +140,24 @@ static inline u64 hv_do_fast_nested_hype /* Fast hypercall with 16 bytes of input */ static inline u64 _hv_do_fast_hypercall16(u64 control, u64 input1, u64 input2) { - u64 hv_status; - #ifdef CONFIG_X86_64 - if (hv_isolation_type_tdx() && !hyperv_paravisor_present) - return hv_tdx_hypercall(control, input1, 
input2); - - if (hv_isolation_type_snp() && !hyperv_paravisor_present) { - __asm__ __volatile__("mov %[input2], %%r8\n" - "vmmcall" - : "=a" (hv_status), ASM_CALL_CONSTRAINT, - "+c" (control), "+d" (input1) - : [input2] "r" (input2) - : "cc", "r8", "r9", "r10", "r11"); - } else { - __asm__ __volatile__("mov %[input2], %%r8\n" - CALL_NOSPEC - : "=a" (hv_status), ASM_CALL_CONSTRAINT, - "+c" (control), "+d" (input1) - : [input2] "r" (input2), - THUNK_TARGET(hv_hypercall_pg) - : "cc", "r8", "r9", "r10", "r11"); - } + return static_call_mod(hv_hypercall)(control, input1, input2); #else - { - u32 input1_hi = upper_32_bits(input1); - u32 input1_lo = lower_32_bits(input1); - u32 input2_hi = upper_32_bits(input2); - u32 input2_lo = lower_32_bits(input2); - - __asm__ __volatile__ (CALL_NOSPEC - : "=A"(hv_status), - "+c"(input1_lo), ASM_CALL_CONSTRAINT - : "A" (control), "b" (input1_hi), - "D"(input2_hi), "S"(input2_lo), - THUNK_TARGET(hv_hypercall_pg) - : "cc"); - } -#endif + u32 input1_hi = upper_32_bits(input1); + u32 input1_lo = lower_32_bits(input1); + u32 input2_hi = upper_32_bits(input2); + u32 input2_lo = lower_32_bits(input2); + u64 hv_status; + + __asm__ __volatile__ (CALL_NOSPEC + : "=A"(hv_status), + "+c"(input1_lo), ASM_CALL_CONSTRAINT + : "A" (control), "b" (input1_hi), + "D"(input2_hi), "S"(input2_lo), + THUNK_TARGET(hv_hypercall_pg) + : "cc"); return hv_status; +#endif } static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2) --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -37,10 +37,6 @@ bool hv_nested; struct ms_hyperv_info ms_hyperv; -/* Used in modules via hv_do_hypercall(): see arch/x86/include/asm/mshyperv.h */ -bool hyperv_paravisor_present __ro_after_init; -EXPORT_SYMBOL_GPL(hyperv_paravisor_present); - #if IS_ENABLED(CONFIG_HYPERV) static inline unsigned int hv_get_nested_msr(unsigned int reg) { @@ -287,6 +283,11 @@ static void __init x86_setup_ops_for_tsc old_restore_sched_clock_state = x86_platform.restore_sched_clock_state; x86_platform.restore_sched_clock_state = hv_restore_sched_clock_state; } + +#ifdef CONFIG_X86_64 +DEFINE_STATIC_CALL(hv_hypercall, hv_pg_hypercall); +EXPORT_STATIC_CALL_TRAMP_GPL(hv_hypercall); +#endif #endif /* CONFIG_HYPERV */ static uint32_t __init ms_hyperv_platform(void) @@ -483,14 +484,16 @@ static void __init ms_hyperv_init_platfo ms_hyperv.shared_gpa_boundary = BIT_ULL(ms_hyperv.shared_gpa_boundary_bits); - hyperv_paravisor_present = !!ms_hyperv.paravisor_present; - pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n", ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b); if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) { static_branch_enable(&isolation_type_snp); +#if defined(CONFIG_AMD_MEM_ENCRYPT) && defined(CONFIG_HYPERV) + if (!ms_hyperv.paravisor_present) + static_call_update(hv_hypercall, hv_snp_hypercall); +#endif } else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_TDX) { static_branch_enable(&isolation_type_tdx); @@ -498,6 +501,9 @@ static void __init ms_hyperv_init_platfo ms_hyperv.hints &= ~HV_X64_APIC_ACCESS_RECOMMENDED; if (!ms_hyperv.paravisor_present) { +#if defined(CONFIG_INTEL_TDX_GUEST) && defined(CONFIG_HYPERV) + static_call_update(hv_hypercall, hv_tdx_hypercall); +#endif /* * Mark the Hyper-V TSC page feature as disabled * in a TDX VM without paravisor so that the From patchwork Mon Apr 14 11:11:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 
14050244
Message-ID: <20250414113754.435282530@infradead.org>
Date: Mon, 14 Apr 2025 13:11:45 +0200
From: Peter Zijlstra
To: x86@kernel.org
Subject: [PATCH 5/6] x86_64,hyperv: Use direct call to hypercall-page
References: <20250414111140.586315004@infradead.org>
X-Mailing-List:
kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Instead of using an indirect call to the hypercall page, use a direct call instead. This avoids all CFI problems, including the one where the hypercall page doesn't have IBT on. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/hyperv/hv_init.c | 62 +++++++++++++++++++++++----------------------- 1 file changed, 31 insertions(+), 31 deletions(-) --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -37,24 +37,42 @@ void *hv_hypercall_pg; #ifdef CONFIG_X86_64 +static u64 __hv_hyperfail(u64 control, u64 param1, u64 param2) +{ + return U64_MAX; +} + +DEFINE_STATIC_CALL(__hv_hypercall, __hv_hyperfail); + u64 hv_pg_hypercall(u64 control, u64 param1, u64 param2) { u64 hv_status; - if (!hv_hypercall_pg) - return U64_MAX; - register u64 __r8 asm("r8") = param2; - asm volatile (CALL_NOSPEC + asm volatile ("call " STATIC_CALL_TRAMP_STR(__hv_hypercall) : "=a" (hv_status), ASM_CALL_CONSTRAINT, "+c" (control), "+d" (param1) - : "r" (__r8), - THUNK_TARGET(hv_hypercall_pg) + : "r" (__r8) : "cc", "memory", "r9", "r10", "r11"); return hv_status; } + +typedef u64 (*hv_hypercall_f)(u64 control, u64 param1, u64 param2); + +static inline void hv_set_hypercall_pg(void *ptr) +{ + hv_hypercall_pg = ptr; + + if (!ptr) + ptr = &__hv_hyperfail; + static_call_update(__hv_hypercall, (hv_hypercall_f)ptr); +} #else +static inline void hv_set_hypercall_pg(void *ptr) +{ + hv_hypercall_pg = ptr; +} EXPORT_SYMBOL_GPL(hv_hypercall_pg); #endif @@ -349,7 +367,7 @@ static int hv_suspend(void) * pointer is restored on resume. */ hv_hypercall_pg_saved = hv_hypercall_pg; - hv_hypercall_pg = NULL; + hv_set_hypercall_pg(NULL); /* Disable the hypercall page in the hypervisor */ rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); @@ -375,7 +393,7 @@ static void hv_resume(void) vmalloc_to_pfn(hv_hypercall_pg_saved); wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); - hv_hypercall_pg = hv_hypercall_pg_saved; + hv_set_hypercall_pg(hv_hypercall_pg_saved); hv_hypercall_pg_saved = NULL; /* @@ -529,8 +547,8 @@ void __init hyperv_init(void) if (hv_isolation_type_tdx() && !ms_hyperv.paravisor_present) goto skip_hypercall_pg_init; - hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, - VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_ROX, + hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, MODULES_VADDR, + MODULES_END, GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS, NUMA_NO_NODE, __builtin_return_address(0)); if (hv_hypercall_pg == NULL) @@ -568,27 +586,9 @@ void __init hyperv_init(void) wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); } -skip_hypercall_pg_init: - /* - * Some versions of Hyper-V that provide IBT in guest VMs have a bug - * in that there's no ENDBR64 instruction at the entry to the - * hypercall page. Because hypercalls are invoked via an indirect call - * to the hypercall page, all hypercall attempts fail when IBT is - * enabled, and Linux panics. For such buggy versions, disable IBT. - * - * Fixed versions of Hyper-V always provide ENDBR64 on the hypercall - * page, so if future Linux kernel versions enable IBT for 32-bit - * builds, additional hypercall page hackery will be required here - * to provide an ENDBR32. 
-	 */
-#ifdef CONFIG_X86_KERNEL_IBT
-	if (cpu_feature_enabled(X86_FEATURE_IBT) &&
-	    *(u32 *)hv_hypercall_pg != gen_endbr()) {
-		setup_clear_cpu_cap(X86_FEATURE_IBT);
-		pr_warn("Disabling IBT because of Hyper-V bug\n");
-	}
-#endif
+	hv_set_hypercall_pg(hv_hypercall_pg);

+skip_hypercall_pg_init:
	/*
	 * hyperv_init() is called before LAPIC is initialized: see
	 * apic_intr_mode_init() -> x86_platform.apic_post_init() and
@@ -658,7 +658,7 @@ void hyperv_cleanup(void)
	 * let hypercall operations fail safely rather than
	 * panic the kernel for using invalid hypercall page
	 */
-	hv_hypercall_pg = NULL;
+	hv_set_hypercall_pg(NULL);

	/* Reset the hypercall page */
	hypercall_msr.as_uint64 = hv_get_msr(HV_X64_MSR_HYPERCALL);

From patchwork Mon Apr 14 11:11:46 2025
X-Patchwork-Submitter: Peter Zijlstra
X-Patchwork-Id: 14050241
Message-ID: <20250414113754.540779611@infradead.org>
Date: Mon, 14 Apr 2025 13:11:46 +0200
From: Peter Zijlstra
To: x86@kernel.org
Subject: [PATCH 6/6] objtool: Validate kCFI calls
References: <20250414111140.586315004@infradead.org>

Validate that all indirect calls adhere to kCFI rules. Notably, doing a
nocfi indirect call to a cfi function is broken. Apparently some Rust
'core' code violates this and explodes when run with FineIBT.

All the ANNOTATE_NOCFI sites are prime targets for attackers.

 - runtime EFI is especially heinous because it also needs to disable IBT.
   Basically calling unknown code without CFI protection at runtime is a
   massive security issue.

 - Kexec image handover; if you can exploit this, you get to keep it :-)

 - KVM, once for the interrupt injection, calling IDT gates directly.

 - KVM, once for the FASTOP emulation stuff.

Signed-off-by: Peter Zijlstra (Intel)
---
 arch/x86/kernel/machine_kexec_64.c  |  1
 arch/x86/kvm/emulate.c              |  4 +++
 arch/x86/kvm/vmx/vmenter.S          |  4 +++
 arch/x86/platform/efi/efi_stub_64.S |  4 +++
 drivers/misc/lkdtm/perms.c          |  2 +
 include/linux/objtool.h             |  3 ++
 include/linux/objtool_types.h       |  1
 tools/include/linux/objtool_types.h |  1
 tools/objtool/check.c               | 41 ++++++++++++++++++++++++++++++++++++
 tools/objtool/include/objtool/elf.h |  1
 10 files changed, 62 insertions(+)

--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -442,6 +442,7 @@ void __nocfi machine_kexec(struct kimage

	__ftrace_enabled_restore(save_ftrace_enabled);
 }
+ANNOTATE_NOCFI_SYM(machine_kexec);

 /* arch-dependent functionality related to kexec file-based syscall */

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5071,6 +5071,10 @@ static noinline int fastop(struct x86_em
		return emulate_de(ctxt);
	return X86EMUL_CONTINUE;
 }
+/*
+ * The ASM stubs don't have CFI on.
+ */
+ANNOTATE_NOCFI_SYM(fastop);

 void init_decode_cache(struct x86_emulate_ctxt *ctxt)
 {
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -363,5 +363,9 @@ SYM_FUNC_END(vmread_error_trampoline)
 .section .text, "ax"

 SYM_FUNC_START(vmx_do_interrupt_irqoff)
+	/*
+	 * Calling an IDT gate directly.
+ */ + ANNOTATE_NOCFI VMX_DO_EVENT_IRQOFF CALL_NOSPEC _ASM_ARG1 SYM_FUNC_END(vmx_do_interrupt_irqoff) --- a/arch/x86/platform/efi/efi_stub_64.S +++ b/arch/x86/platform/efi/efi_stub_64.S @@ -11,6 +11,10 @@ #include SYM_FUNC_START(__efi_call) + /* + * The EFI code doesn't have any CFI :-( + */ + ANNOTATE_NOCFI pushq %rbp movq %rsp, %rbp and $~0xf, %rsp --- a/drivers/misc/lkdtm/perms.c +++ b/drivers/misc/lkdtm/perms.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include @@ -86,6 +87,7 @@ static noinline __nocfi void execute_loc func(); pr_err("FAIL: func returned\n"); } +ANNOTATE_NOCFI_SYM(execute_location); static void execute_user_location(void *dst) { --- a/include/linux/objtool.h +++ b/include/linux/objtool.h @@ -185,6 +185,8 @@ */ #define ANNOTATE_REACHABLE(label) __ASM_ANNOTATE(label, ANNOTYPE_REACHABLE) +#define ANNOTATE_NOCFI_SYM(sym) asm(__ASM_ANNOTATE(sym, ANNOTYPE_NOCFI)) + #else #define ANNOTATE_NOENDBR ANNOTATE type=ANNOTYPE_NOENDBR #define ANNOTATE_RETPOLINE_SAFE ANNOTATE type=ANNOTYPE_RETPOLINE_SAFE @@ -194,6 +196,7 @@ #define ANNOTATE_INTRA_FUNCTION_CALL ANNOTATE type=ANNOTYPE_INTRA_FUNCTION_CALL #define ANNOTATE_UNRET_BEGIN ANNOTATE type=ANNOTYPE_UNRET_BEGIN #define ANNOTATE_REACHABLE ANNOTATE type=ANNOTYPE_REACHABLE +#define ANNOTATE_NOCFI ANNOTATE type=ANNOTYPE_NOCFI #endif #if defined(CONFIG_NOINSTR_VALIDATION) && \ --- a/include/linux/objtool_types.h +++ b/include/linux/objtool_types.h @@ -66,5 +66,6 @@ struct unwind_hint { #define ANNOTYPE_INTRA_FUNCTION_CALL 7 #define ANNOTYPE_REACHABLE 8 #define ANNOTYPE_JUMP_TABLE 9 +#define ANNOTYPE_NOCFI 10 #endif /* _LINUX_OBJTOOL_TYPES_H */ --- a/tools/include/linux/objtool_types.h +++ b/tools/include/linux/objtool_types.h @@ -66,5 +66,6 @@ struct unwind_hint { #define ANNOTYPE_INTRA_FUNCTION_CALL 7 #define ANNOTYPE_REACHABLE 8 #define ANNOTYPE_JUMP_TABLE 9 +#define ANNOTYPE_NOCFI 10 #endif /* _LINUX_OBJTOOL_TYPES_H */ --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -2387,6 +2387,8 @@ static int __annotate_ifc(struct objtool static int __annotate_late(struct objtool_file *file, int type, struct instruction *insn) { + struct symbol *sym; + switch (type) { case ANNOTYPE_NOENDBR: /* early */ @@ -2436,6 +2438,15 @@ static int __annotate_late(struct objtoo insn->_jump_table = (void *)1; break; + case ANNOTYPE_NOCFI: + sym = insn->sym; + if (!sym) { + WARN_INSN(insn, "dodgy NOCFI annotation"); + break; + } + insn->sym->nocfi = 1; + break; + default: ERROR_INSN(insn, "Unknown annotation type: %d", type); return -1; @@ -4006,6 +4017,36 @@ static int validate_retpoline(struct obj warnings++; } + if (!opts.cfi) + return warnings; + + /* + * kCFI call sites look like: + * + * movl $(-0x12345678), %r10d + * addl -4(%r11), %r10d + * jz 1f + * ud2 + * 1: cs call __x86_indirect_thunk_r11 + * + * Verify all indirect calls are kCFI adorned by checking for the + * UD2. Notably, doing __nocfi calls to regular (cfi) functions is + * broken. 
+ */ + list_for_each_entry(insn, &file->retpoline_call_list, call_node) { + struct symbol *sym = insn->sym; + + if (sym && sym->type == STT_FUNC && !sym->nocfi) { + struct instruction *prev = + prev_insn_same_sym(file, insn); + + if (!prev || prev->type != INSN_BUG) { + WARN_INSN(insn, "no-cfi indirect call!"); + warnings++; + } + } + } + return warnings; } --- a/tools/objtool/include/objtool/elf.h +++ b/tools/objtool/include/objtool/elf.h @@ -70,6 +70,7 @@ struct symbol { u8 local_label : 1; u8 frame_pointer : 1; u8 ignore : 1; + u8 nocfi : 1; struct list_head pv_target; struct reloc *relocs; };
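As a closing usage sketch (illustrative only, not part of the series; the function
name below is hypothetical): code that deliberately makes a raw, non-kCFI indirect
call, like the lkdtm and kexec cases above, is expected to be __nocfi and carry
ANNOTATE_NOCFI_SYM() so that objtool's new check does not warn about it.

#include <linux/compiler.h>
#include <linux/objtool.h>

/*
 * Hypothetical example: jump into a blob of generated machine code that has
 * no kCFI type hash in front of it.  In a __nocfi function the compiler emits
 * no hash-check (ud2) sequence before the indirect call, so without the
 * ANNOTATE_NOCFI_SYM() below objtool would now report "no-cfi indirect call!"
 * for the call through 'func'.
 */
static noinline __nocfi void run_untyped_blob(void *code)
{
	void (*func)(void) = code;

	func();
}
ANNOTATE_NOCFI_SYM(run_untyped_blob);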