diff mbox series

[v3,2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded

Message ID 20230415164029.526895-3-reijiw@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: arm64: PMU: Correct the handling of PMUSERENR_EL0 | expand

Commit Message

Reiji Watanabe April 15, 2023, 4:40 p.m. UTC
Currently, with VHE, KVM sets ER, CR, SW and EN bits of
PMUSERENR_EL0 to 1 on vcpu_load(), and saves and restores
the register value for the host on vcpu_load() and vcpu_put().
If the value of those bits are cleared on a pCPU with a vCPU
loaded (armv8pmu_start() would do that when PMU counters are
programmed for the guest), PMU access from the guest EL0 might
be trapped to the guest EL1 directly regardless of the current
PMUSERENR_EL0 value of the vCPU.

Fix this by not letting armv8pmu_start() overwrite PMUSERENR_EL0
on the pCPU where PMUSERENR_EL0 for the guest is loaded, and
instead updating the saved shadow register value for the host,
so that the value can be restored on vcpu_put() later.
While vcpu_{put,load}() are manipulating PMUSERENR_EL0, disable
IRQs to prevent a race condition between these processes and IPIs
that attempt to update PMUSERENR_EL0 for the host EL0.
As this change (disabling IRQs) is applied to the nVHE hyp code,
unwanted code (i.e. trace_hardirqs_off) will be included in the
hyp code when CONFIG_TRACE_IRQFLAGS is enabled.  Introduce
NO_TRACE_IRQFLAGS macro to locally disable CONFIG_TRACE_IRQFLAGS
in the nVHE hyp code.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Marc Zyngier <maz@kernel.org>
Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
---
 arch/arm64/include/asm/kvm_host.h       |  7 +++++++
 arch/arm64/kernel/perf_event.c          | 21 ++++++++++++++++++---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 24 ++++++++++++++++++++++++
 arch/arm64/kvm/hyp/nvhe/Makefile        |  2 +-
 arch/arm64/kvm/pmu.c                    | 25 +++++++++++++++++++++++++
 include/linux/irqflags.h                |  4 ++--
 6 files changed, 77 insertions(+), 6 deletions(-)

Comments

kernel test robot April 15, 2023, 10:09 p.m. UTC | #1
Hi Reiji,

kernel test robot noticed the following build errors:

[auto build test ERROR on 09a9639e56c01c7a00d6c0ca63f4c7c41abe075d]

url:    https://github.com/intel-lab-lkp/linux/commits/Reiji-Watanabe/KVM-arm64-PMU-Restore-the-host-s-PMUSERENR_EL0/20230416-004142
base:   09a9639e56c01c7a00d6c0ca63f4c7c41abe075d
patch link:    https://lore.kernel.org/r/20230415164029.526895-3-reijiw%40google.com
patch subject: [PATCH v3 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
config: arm64-allyesconfig (https://download.01.org/0day-ci/archive/20230416/202304160658.Oqr1xZbi-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/276e15e5db09003944d194da2b2577cff5192884
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Reiji-Watanabe/KVM-arm64-PMU-Restore-the-host-s-PMUSERENR_EL0/20230416-004142
        git checkout 276e15e5db09003944d194da2b2577cff5192884
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304160658.Oqr1xZbi-lkp@intel.com/

All errors (new ones prefixed by >>):

   aarch64-linux-ld: Unexpected GOT/PLT entries detected!
   aarch64-linux-ld: Unexpected run-time procedure linkages detected!
   aarch64-linux-ld: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.o: in function `__kvm_nvhe___activate_traps_common':
>> __kvm_nvhe_switch.c:(.hyp.text+0x14b4): undefined reference to `__kvm_nvhe_warn_bogus_irq_restore'
   aarch64-linux-ld: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.o: in function `__kvm_nvhe___deactivate_traps_common':
   __kvm_nvhe_switch.c:(.hyp.text+0x1f6c): undefined reference to `__kvm_nvhe_warn_bogus_irq_restore'
Reiji Watanabe April 16, 2023, 3:31 a.m. UTC | #2
On Sat, Apr 15, 2023 at 3:10 PM kernel test robot <lkp@intel.com> wrote:
>
> Hi Reiji,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on 09a9639e56c01c7a00d6c0ca63f4c7c41abe075d]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Reiji-Watanabe/KVM-arm64-PMU-Restore-the-host-s-PMUSERENR_EL0/20230416-004142
> base:   09a9639e56c01c7a00d6c0ca63f4c7c41abe075d
> patch link:    https://lore.kernel.org/r/20230415164029.526895-3-reijiw%40google.com
> patch subject: [PATCH v3 2/2] KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
> config: arm64-allyesconfig (https://download.01.org/0day-ci/archive/20230416/202304160658.Oqr1xZbi-lkp@intel.com/config)
> compiler: aarch64-linux-gcc (GCC) 12.1.0
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # https://github.com/intel-lab-lkp/linux/commit/276e15e5db09003944d194da2b2577cff5192884
>         git remote add linux-review https://github.com/intel-lab-lkp/linux
>         git fetch --no-tags linux-review Reiji-Watanabe/KVM-arm64-PMU-Restore-the-host-s-PMUSERENR_EL0/20230416-004142
>         git checkout 276e15e5db09003944d194da2b2577cff5192884
>         # save the config file
>         mkdir build_dir && cp config build_dir/.config
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 olddefconfig
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash
>
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <lkp@intel.com>
> | Link: https://lore.kernel.org/oe-kbuild-all/202304160658.Oqr1xZbi-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
>    aarch64-linux-ld: Unexpected GOT/PLT entries detected!
>    aarch64-linux-ld: Unexpected run-time procedure linkages detected!
>    aarch64-linux-ld: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.o: in function `__kvm_nvhe___activate_traps_common':
> >> __kvm_nvhe_switch.c:(.hyp.text+0x14b4): undefined reference to `__kvm_nvhe_warn_bogus_irq_restore'
>    aarch64-linux-ld: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.o: in function `__kvm_nvhe___deactivate_traps_common':
>    __kvm_nvhe_switch.c:(.hyp.text+0x1f6c): undefined reference to `__kvm_nvhe_warn_bogus_irq_restore'

It looks like this happens when CONFIG_DEBUG_IRQFLAGS is enabled.
I am going to introduce another flag to disable this as well.

Thanks,
Reiji
Oliver Upton May 25, 2023, 11:36 p.m. UTC | #3
Hi Reiji,

Apologies, this fell off my list of reviews.

On Sat, Apr 15, 2023 at 09:40:29AM -0700, Reiji Watanabe wrote:

[...]

>  static void armv8pmu_enable_event(struct perf_event *event)
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 6718731729fd..7e73be12cfaf 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -82,12 +82,24 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
>  	 */
>  	if (kvm_arm_support_pmu_v3()) {
>  		struct kvm_cpu_context *hctxt;
> +		unsigned long flags;
>  
>  		write_sysreg(0, pmselr_el0);
>  
>  		hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> +
> +		/*
> +		 * Disable IRQs to prevent a race condition between the
> +		 * following code and IPIs that attempts to update
> +		 * PMUSERENR_EL0. See also kvm_set_pmuserenr().
> +		 */
> +		local_irq_save(flags);
> +
>  		ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
>  		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
> +		vcpu_set_flag(vcpu, PMUSERENR_ON_CPU);
> +
> +		local_irq_restore(flags);

Can the IRQ save/restore be moved to {activate,deactivate}_traps_vhe_{load,put}()?

That'd eliminate the dance to avoid using kernel-only symbols in nVHE
and would be consistent with the existing usage of
__{activate,deactivate}_traps_common() from nVHE (IRQs already
disabled).

IMO, the less nVHE knows about the kernel the better.

>  	}
>  
>  	vcpu->arch.mdcr_el2_host = read_sysreg(mdcr_el2);
> @@ -112,9 +124,21 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
>  	write_sysreg(0, hstr_el2);
>  	if (kvm_arm_support_pmu_v3()) {
>  		struct kvm_cpu_context *hctxt;
> +		unsigned long flags;
>  
>  		hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> +
> +		/*
> +		 * Disable IRQs to prevent a race condition between the
> +		 * following code and IPIs that attempts to update
> +		 * PMUSERENR_EL0. See also kvm_set_pmuserenr().
> +		 */
> +		local_irq_save(flags);
> +
>  		write_sysreg(ctxt_sys_reg(hctxt, PMUSERENR_EL0), pmuserenr_el0);
> +		vcpu_clear_flag(vcpu, PMUSERENR_ON_CPU);
> +
> +		local_irq_restore(flags);
>  	}
>  
>  	if (cpus_have_final_cap(ARM64_SME)) {
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index 530347cdebe3..2c08a54ca7d9 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -10,7 +10,7 @@ asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
>  # will explode instantly (Words of Marc Zyngier). So introduce a generic flag
>  # __DISABLE_TRACE_MMIO__ to disable MMIO tracing for nVHE KVM.
>  ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
> -ccflags-y += -fno-stack-protector	\
> +ccflags-y += -fno-stack-protector -DNO_TRACE_IRQFLAGS \
>  	     -DDISABLE_BRANCH_PROFILING	\
>  	     $(DISABLE_STACKLEAK_PLUGIN)
>  
> diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> index 7887133d15f0..d6a863853bfe 100644
> --- a/arch/arm64/kvm/pmu.c
> +++ b/arch/arm64/kvm/pmu.c
> @@ -209,3 +209,28 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
>  	kvm_vcpu_pmu_enable_el0(events_host);
>  	kvm_vcpu_pmu_disable_el0(events_guest);
>  }
> +
> +/*
> + * With VHE, keep track of the PMUSERENR_EL0 value for the host EL0 on the pCPU
> + * where PMUSERENR_EL0 for the guest is loaded, since PMUSERENR_EL0 is switched
> + * to the value for the guest on vcpu_load().  The value for the host EL0
> + * will be restored on vcpu_put(), before returning to the EL0.

wording: s/the EL0/EL0. Or, alternatively, to avoid repeating yourself
you can just say "returning to userspace".

You may also want to mention in passing why this isn't necessary for
nVHE, as the register is context switched for every guest enter/exit.

> + *
> + * Return true if KVM takes care of the register. Otherwise return false.
> + */
> +bool kvm_set_pmuserenr(u64 val)
> +{
> +	struct kvm_cpu_context *hctxt;
> +	struct kvm_vcpu *vcpu;
> +
> +	if (!kvm_arm_support_pmu_v3() || !has_vhe())
> +		return false;
> +
> +	vcpu = kvm_get_running_vcpu();
> +	if (!vcpu || !vcpu_get_flag(vcpu, PMUSERENR_ON_CPU))
> +		return false;
> +
> +	hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> +	ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
> +	return true;
> +}
Reiji Watanabe May 27, 2023, 4:16 a.m. UTC | #4
Hi Oliver,

On Thu, May 25, 2023 at 11:36:23PM +0000, Oliver Upton wrote:
> Hi Reiji,
> 
> Apologies, this fell off my list of reviews.
> 
> On Sat, Apr 15, 2023 at 09:40:29AM -0700, Reiji Watanabe wrote:
> 
> [...]
> 
> >  static void armv8pmu_enable_event(struct perf_event *event)
> > diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > index 6718731729fd..7e73be12cfaf 100644
> > --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> > +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > @@ -82,12 +82,24 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
> >  	 */
> >  	if (kvm_arm_support_pmu_v3()) {
> >  		struct kvm_cpu_context *hctxt;
> > +		unsigned long flags;
> >  
> >  		write_sysreg(0, pmselr_el0);
> >  
> >  		hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> > +
> > +		/*
> > +		 * Disable IRQs to prevent a race condition between the
> > +		 * following code and IPIs that attempts to update
> > +		 * PMUSERENR_EL0. See also kvm_set_pmuserenr().
> > +		 */
> > +		local_irq_save(flags);
> > +
> >  		ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
> >  		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
> > +		vcpu_set_flag(vcpu, PMUSERENR_ON_CPU);
> > +
> > +		local_irq_restore(flags);
> 
> Can the IRQ save/restore be moved to {activate,deactivate}_traps_vhe_{load,put}()?
> 
> That'd eliminate the dance to avoid using kernel-only symbols in nVHE
> and would be consistent with the existing usage of
> __{activate,deactivate}_traps_common() from nVHE (IRQs already
> disabled).
> 
> IMO, the less nVHE knows about the kernel the better.

Thank you for the comments.
Sure, I will move them to {activate,deactivate}_traps_vhe_{load,put}().


> 
> >  	}
> >  
> >  	vcpu->arch.mdcr_el2_host = read_sysreg(mdcr_el2);
> > @@ -112,9 +124,21 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
> >  	write_sysreg(0, hstr_el2);
> >  	if (kvm_arm_support_pmu_v3()) {
> >  		struct kvm_cpu_context *hctxt;
> > +		unsigned long flags;
> >  
> >  		hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> > +
> > +		/*
> > +		 * Disable IRQs to prevent a race condition between the
> > +		 * following code and IPIs that attempts to update
> > +		 * PMUSERENR_EL0. See also kvm_set_pmuserenr().
> > +		 */
> > +		local_irq_save(flags);
> > +
> >  		write_sysreg(ctxt_sys_reg(hctxt, PMUSERENR_EL0), pmuserenr_el0);
> > +		vcpu_clear_flag(vcpu, PMUSERENR_ON_CPU);
> > +
> > +		local_irq_restore(flags);
> >  	}
> >  
> >  	if (cpus_have_final_cap(ARM64_SME)) {
> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > index 530347cdebe3..2c08a54ca7d9 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > @@ -10,7 +10,7 @@ asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
> >  # will explode instantly (Words of Marc Zyngier). So introduce a generic flag
> >  # __DISABLE_TRACE_MMIO__ to disable MMIO tracing for nVHE KVM.
> >  ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
> > -ccflags-y += -fno-stack-protector	\
> > +ccflags-y += -fno-stack-protector -DNO_TRACE_IRQFLAGS \
> >  	     -DDISABLE_BRANCH_PROFILING	\
> >  	     $(DISABLE_STACKLEAK_PLUGIN)
> >  
> > diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> > index 7887133d15f0..d6a863853bfe 100644
> > --- a/arch/arm64/kvm/pmu.c
> > +++ b/arch/arm64/kvm/pmu.c
> > @@ -209,3 +209,28 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
> >  	kvm_vcpu_pmu_enable_el0(events_host);
> >  	kvm_vcpu_pmu_disable_el0(events_guest);
> >  }
> > +
> > +/*
> > + * With VHE, keep track of the PMUSERENR_EL0 value for the host EL0 on the pCPU
> > + * where PMUSERENR_EL0 for the guest is loaded, since PMUSERENR_EL0 is switched
> > + * to the value for the guest on vcpu_load().  The value for the host EL0
> > + * will be restored on vcpu_put(), before returning to the EL0.
> 
> wording: s/the EL0/EL0. Or, alternatively, to avoid repeating yourself
> you can just say "returning to userspace".
> 
> You may also want to mention in passing why this isn't necessary for
> nVHE, as the register is context switched for every guest enter/exit.

Thank you for the suggestions. I will fix those.

Thank you,
Reiji


> 
> > + *
> > + * Return true if KVM takes care of the register. Otherwise return false.
> > + */
> > +bool kvm_set_pmuserenr(u64 val)
> > +{
> > +	struct kvm_cpu_context *hctxt;
> > +	struct kvm_vcpu *vcpu;
> > +
> > +	if (!kvm_arm_support_pmu_v3() || !has_vhe())
> > +		return false;
> > +
> > +	vcpu = kvm_get_running_vcpu();
> > +	if (!vcpu || !vcpu_get_flag(vcpu, PMUSERENR_ON_CPU))
> > +		return false;
> > +
> > +	hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> > +	ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
> > +	return true;
> > +}
> 
> -- 
> Thanks,
> Oliver
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index bcd774d74f34..c49cfda2740a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -668,6 +668,8 @@  struct kvm_vcpu_arch {
 /* Software step state is Active-pending */
 #define DBG_SS_ACTIVE_PENDING	__vcpu_single_flag(sflags, BIT(5))
 
+/* PMUSERENR for the guest EL0 is on physical CPU */
+#define PMUSERENR_ON_CPU	__vcpu_single_flag(sflags, BIT(6))
 
 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
 #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
@@ -1028,9 +1030,14 @@  void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
 #ifdef CONFIG_KVM
 void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
 void kvm_clr_pmu_events(u32 clr);
+bool kvm_set_pmuserenr(u64 val);
 #else
 static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
 static inline void kvm_clr_pmu_events(u32 clr) {}
+static inline bool kvm_set_pmuserenr(u64 val)
+{
+	return false;
+}
 #endif
 
 void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index dde06c0f97f3..33bb5f548f8a 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -741,9 +741,25 @@  static inline u32 armv8pmu_getreset_flags(void)
 	return value;
 }
 
+static void update_pmuserenr(u64 val)
+{
+	lockdep_assert_irqs_disabled();
+
+	/*
+	 * The current PMUSERENR_EL0 value might be the value for the guest.
+	 * If that's the case, have KVM keep tracking of the register value
+	 * for the host EL0 so that KVM can restore it before returning to
+	 * the host EL0. Otherwise, update the register now.
+	 */
+	if (kvm_set_pmuserenr(val))
+		return;
+
+	write_sysreg(val, pmuserenr_el0);
+}
+
 static void armv8pmu_disable_user_access(void)
 {
-	write_sysreg(0, pmuserenr_el0);
+	update_pmuserenr(0);
 }
 
 static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
@@ -759,8 +775,7 @@  static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
 			armv8pmu_write_evcntr(i, 0);
 	}
 
-	write_sysreg(0, pmuserenr_el0);
-	write_sysreg(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR, pmuserenr_el0);
+	update_pmuserenr(ARMV8_PMU_USERENR_ER | ARMV8_PMU_USERENR_CR);
 }
 
 static void armv8pmu_enable_event(struct perf_event *event)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 6718731729fd..7e73be12cfaf 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -82,12 +82,24 @@  static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
 	 */
 	if (kvm_arm_support_pmu_v3()) {
 		struct kvm_cpu_context *hctxt;
+		unsigned long flags;
 
 		write_sysreg(0, pmselr_el0);
 
 		hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+
+		/*
+		 * Disable IRQs to prevent a race condition between the
+		 * following code and IPIs that attempts to update
+		 * PMUSERENR_EL0. See also kvm_set_pmuserenr().
+		 */
+		local_irq_save(flags);
+
 		ctxt_sys_reg(hctxt, PMUSERENR_EL0) = read_sysreg(pmuserenr_el0);
 		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
+		vcpu_set_flag(vcpu, PMUSERENR_ON_CPU);
+
+		local_irq_restore(flags);
 	}
 
 	vcpu->arch.mdcr_el2_host = read_sysreg(mdcr_el2);
@@ -112,9 +124,21 @@  static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
 	write_sysreg(0, hstr_el2);
 	if (kvm_arm_support_pmu_v3()) {
 		struct kvm_cpu_context *hctxt;
+		unsigned long flags;
 
 		hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+
+		/*
+		 * Disable IRQs to prevent a race condition between the
+		 * following code and IPIs that attempts to update
+		 * PMUSERENR_EL0. See also kvm_set_pmuserenr().
+		 */
+		local_irq_save(flags);
+
 		write_sysreg(ctxt_sys_reg(hctxt, PMUSERENR_EL0), pmuserenr_el0);
+		vcpu_clear_flag(vcpu, PMUSERENR_ON_CPU);
+
+		local_irq_restore(flags);
 	}
 
 	if (cpus_have_final_cap(ARM64_SME)) {
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 530347cdebe3..2c08a54ca7d9 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,7 +10,7 @@  asflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS
 # will explode instantly (Words of Marc Zyngier). So introduce a generic flag
 # __DISABLE_TRACE_MMIO__ to disable MMIO tracing for nVHE KVM.
 ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -D__DISABLE_EXPORTS -D__DISABLE_TRACE_MMIO__
-ccflags-y += -fno-stack-protector	\
+ccflags-y += -fno-stack-protector -DNO_TRACE_IRQFLAGS \
 	     -DDISABLE_BRANCH_PROFILING	\
 	     $(DISABLE_STACKLEAK_PLUGIN)
 
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 7887133d15f0..d6a863853bfe 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -209,3 +209,28 @@  void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
 	kvm_vcpu_pmu_enable_el0(events_host);
 	kvm_vcpu_pmu_disable_el0(events_guest);
 }
+
+/*
+ * With VHE, keep track of the PMUSERENR_EL0 value for the host EL0 on the pCPU
+ * where PMUSERENR_EL0 for the guest is loaded, since PMUSERENR_EL0 is switched
+ * to the value for the guest on vcpu_load().  The value for the host EL0
+ * will be restored on vcpu_put(), before returning to the EL0.
+ *
+ * Return true if KVM takes care of the register. Otherwise return false.
+ */
+bool kvm_set_pmuserenr(u64 val)
+{
+	struct kvm_cpu_context *hctxt;
+	struct kvm_vcpu *vcpu;
+
+	if (!kvm_arm_support_pmu_v3() || !has_vhe())
+		return false;
+
+	vcpu = kvm_get_running_vcpu();
+	if (!vcpu || !vcpu_get_flag(vcpu, PMUSERENR_ON_CPU))
+		return false;
+
+	hctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+	ctxt_sys_reg(hctxt, PMUSERENR_EL0) = val;
+	return true;
+}
diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 5ec0fa71399e..f7fd5d645b52 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -198,9 +198,9 @@  extern void warn_bogus_irq_restore(void);
 
 /*
  * The local_irq_*() APIs are equal to the raw_local_irq*()
- * if !TRACE_IRQFLAGS.
+ * if !TRACE_IRQFLAGS or if NO_TRACE_IRQFLAGS is locally set.
  */
-#ifdef CONFIG_TRACE_IRQFLAGS
+#if defined CONFIG_TRACE_IRQFLAGS && !defined(NO_TRACE_IRQFLAGS)
 
 #define local_irq_enable()				\
 	do {						\