[RFC] arm64: Move HYP text out of kernel mapping

Message ID 20230210100006.1161696-1-ardb@kernel.org (mailing list archive)
State New, archived

Commit Message

Ard Biesheuvel Feb. 10, 2023, 10 a.m. UTC
The HYP text region contains the code that the hypervisor runs when
running KVM at EL2. This code is never called by the kernel running at
EL1, regardless of whether it booted at EL2 or whether it runs KVM in
VHE mode or not.

This means that this code has no need to be mapped with executable
permissions in the kernel's address space, and should therefore be
moved out of it. That way, any gadgets that may exist in this code are
no longer exploitable at the kernel's exception level (speculative or
otherwise).

Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org> 
Cc: Quentin Perret <qperret@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/vmlinux.lds.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

This change currently results in the following warnings*:

  (kvm_arm_pmu_available) can't patch jump_label at __kvm_nvhe___kvm_vcpu_run+0x16c/0x570
  (kvm_arm_pmu_available) can't patch jump_label at __kvm_nvhe___deactivate_traps+0x40/0x144
  (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_vcpu_run+0x380/0x570
  (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe_hyp_panic+0x54/0xf8
  (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_tlb_flush_vmid_ipa+0xc0/0x1b8
  (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_tlb_flush_vmid+0x84/0x150
  (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_flush_cpu_context+0x84/0x150
  (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe_handle_trap+0x80/0x128

The warnings are due to the fact that the jump label code refuses to
patch sections that are not kernel text.
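
For illustration only, the kind of gate described above can be modelled in
userspace C roughly as follows. This is a sketch, not the code in
kernel/jump_label.c: struct text_range, model_is_kernel_text() and
model_can_patch() are hypothetical names, and the single address range is an
assumed stand-in for whatever the kernel treats as kernel text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's notion of "kernel text". */
struct text_range {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
};

/* Model of an "is this address kernel text?" lookup. */
static bool model_is_kernel_text(const struct text_range *text, uintptr_t addr)
{
	assert(text->start <= text->end);
	return addr >= text->start && addr < text->end;
}

/*
 * Model of the jump label gate: a patch site is only updated if it lies
 * inside the range treated as kernel text; otherwise warn and skip it.
 */
static bool model_can_patch(const struct text_range *text, uintptr_t site)
{
	if (!model_is_kernel_text(text, site)) {
		fprintf(stderr, "can't patch jump_label at %#lx\n",
			(unsigned long)site);
		return false;
	}
	return true;
}
```

In this model, moving a patch site outside the text range (as the patch does
for the HYP text) makes model_can_patch() return false and emit the warning,
while sites that stay inside the range are still patched.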

So the questions are:
a) Mark pointed out off-list that he has been getting rid of static keys
   in favor of alternatives in the arch code, as those are guaranteed to
   be patched only once. Should we try to get rid of these as well?

b) These look like they are set only once and never turned off again.
   The pKVM one is definitely only set at boot time, but I couldn't
   figure out whether the same applies to the PMU one?

c) for Peter: could we relax this check (kernel/jump_label.c:446) to
   permit jump labels in .rodata as well?

(* after changing the WARN_ONCE() to pr_warn() and tweaking the output)

Comments

Marc Zyngier Feb. 10, 2023, 11:56 a.m. UTC | #1
On Fri, 10 Feb 2023 10:00:06 +0000,
Ard Biesheuvel <ardb@kernel.org> wrote:
> 
> The HYP text region contains the code that the hypervisor runs when
> running KVM at EL2. This code is never called by the kernel running at
> EL1, regardless of whether it booted at EL2 or whether it runs KVM in
> VHE mode or not.
> 
> This means that this code has no need to be mapped with executable
> permissions in the kernel's address space, and should therefore be
> moved out of it. That way, any gadgets that may exist in this code are
> no longer exploitable at the kernel's exception level (speculative or
> otherwise).

I *really* like this, as it also means that we can simply free this
code when running VHE or when EL2 isn't available at all (in a guest).

> 
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org> 
> Cc: Quentin Perret <qperret@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/kernel/vmlinux.lds.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> This change currently results in the following warnings*:
> 
>   (kvm_arm_pmu_available) can't patch jump_label at __kvm_nvhe___kvm_vcpu_run+0x16c/0x570
>   (kvm_arm_pmu_available) can't patch jump_label at __kvm_nvhe___deactivate_traps+0x40/0x144
>   (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_vcpu_run+0x380/0x570
>   (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe_hyp_panic+0x54/0xf8
>   (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_tlb_flush_vmid_ipa+0xc0/0x1b8
>   (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_tlb_flush_vmid+0x84/0x150
>   (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe___kvm_flush_cpu_context+0x84/0x150
>   (kvm_protected_mode_initialized) can't patch jump_label at __kvm_nvhe_handle_trap+0x80/0x128
> 
> The warnings are due to the fact that the jump label code refuses to
> patch sections that are not kernel text.
> 
> So the questions are:
> a) Mark pointed out off-list that he has been getting rid of static keys
>    in favor of alternatives in the arch code, as those are guaranteed to
>    be patched only once. Should we try to get rid of these as well?

The question is whether we can use these alternatives at such a late
point in the boot process. Today, we are done with the alternatives as
soon as all the early CPUs are up.
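
To make the "patched only once" constraint concrete, here is a minimal
userspace sketch. It is a model under stated assumptions, not the arm64
alternatives framework: struct once_patched and its helpers are hypothetical
names, and "finalize" stands in for the single patching pass that happens
once the early CPUs are up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a capability that can be finalized exactly once. */
struct once_patched {
	bool finalized;	/* has the single patching pass run? */
	bool enabled;	/* value fixed by that pass */
};

/*
 * Model of the one-shot patching pass. The first call fixes the value;
 * any later call is rejected, unlike a static key that can be toggled.
 */
static bool once_patched_finalize(struct once_patched *p, bool enable)
{
	if (p->finalized)
		return false;
	p->enabled = enable;
	p->finalized = true;
	return true;
}

/* Readers may only consult the value after it has been finalized. */
static bool once_patched_enabled(const struct once_patched *p)
{
	assert(p->finalized);
	return p->enabled;
}
```

The point of the model is that anything decided after the finalize step, such
as a PMU driver registering late, can no longer influence the result, which is
why the timing of the decision matters here.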

> b) These look like they are set only once and never turned off again.
>    The pKVM one is definitely only set at boot time, but I couldn't
>    figure out whether the same applies to the PMU one?

Yes, the PMU is in the same bag. As soon as we have found an
architectural PMU *and* the driver has been registered, we're
good. But we cannot just rely on the CPU ID regs as the perf backend
could fail to register.

In both cases, this would be very late patching. Mark?

Thanks,

	M.
Mark Rutland Feb. 10, 2023, 4:40 p.m. UTC | #2
On Fri, Feb 10, 2023 at 11:56:01AM +0000, Marc Zyngier wrote:
> On Fri, 10 Feb 2023 10:00:06 +0000,
> Ard Biesheuvel <ardb@kernel.org> wrote:
> > So the questions are:
> > a) Mark pointed out off-list that he has been getting rid of static keys
> >    in favor of alternatives in the arch code, as those are guaranteed to
> >    be patched only once. Should we try to get rid of these as well?
> 
> The question is whether we can use these alternatives at such a late
> point in the boot process. Today, we are done with the alternatives as
> soon as all the early CPUs are up.

My thinking is that anything pKVM relies upon must be settled around that time
(and certainly before any late secondaries are onlined), so we should be able
to pull the few remaining bits and pieces a little earlier.

> > b) These look like they are set only once and never turned off again.
> >    The pKVM one is definitely only set at boot time, but I couldn't
> >    figure out whether the same applies to the PMU one?
> 
> Yes, the PMU is in the same bag. As soon as we have found an
> architectural PMU *and* that the driver has been registered, we're
> good. 

As above, I was hoping we could somehow pull that before patching.

> But we cannot just rely on the CPU ID regs as the perf backend
> could fail to register.

I thought pKVM just cared about homogeneity here, and was hiding the PMU state
from the host, so does it matter what the host does, and if the host fails to
register a perf backend?

It doesn't seem right that pKVM would rely upon the host to manage the PMU
given pKVM cannot trust the host...

Thanks,
Mark.

Patch

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 1a43df27a20461ca..f42c070c3b4530c6 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -180,7 +180,6 @@  SECTIONS
 			CPUIDLE_TEXT
 			LOCK_TEXT
 			KPROBES_TEXT
-			HYPERVISOR_TEXT
 			*(.gnu.warning)
 		. = ALIGN(16);
 		*(.got)			/* Global offset table		*/
@@ -208,6 +207,7 @@  SECTIONS
 		HIBERNATE_TEXT
 		KEXEC_TEXT
 		IDMAP_TEXT
+		HYPERVISOR_TEXT
 		. = ALIGN(PAGE_SIZE);
 	}