diff mbox series

[v3] KVM: arm64: Use BTI for nvhe

Message ID 20230530150845.2856828-1-smostafa@google.com (mailing list archive)
State New, archived
Headers show
Series [v3] KVM: arm64: Use BTI for nvhe | expand

Commit Message

Mostafa Saleh May 30, 2023, 3:08 p.m. UTC
CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
However, the nvhe code doesn't make use of it as it doesn't map any
pages with Guarded Page(GP) bit.

kvm pgtable code is modified to map executable pages with GP bit
if BTI is enabled for the kernel.

At hyp init, SCTLR_EL2.BT is set to 1 to match EL1 configuration
(SCTLR_EL1.BT1) set in bti_enable().

One difference between kernel and nvhe code, is that the kernel maps
.text with GP while nvhe maps all the executable pages, this makes
nvhe code need to deal with special initialization code coming from
other executable sections (.idmap.text).
For this we need to add bti instruction at the beginning of
__kvm_handle_stub_hvc as it can be called by  __host_hvc through
branch instruction(br) and unlike SYM_FUNC_START, SYM_CODE_START
doesn’t add bti instruction at the beginning, and it can’t be modified
to add it as it is used with vector tables.
Another solution which is more intrusive is to convert
__kvm_handle_stub_hvc to a function and inject “bti jc” instead of
“bti c” in SYM_FUNC_START

Signed-off-by: Mostafa Saleh <smostafa@google.com>
---
v2 -> v3:
- Map all executable pages with GP bit instead of just .text, this
  simplifies the code and avoids hacks in pgtable code.
v1 -> v2:
- Enable BTI for nvhe also.
- Only set GP bit for executable pages from pgtable code.
- Set SCTLR_EL2.BT when BTI is used.
- use system_supports_bti() for consistency.
- Add hyp_init_valid_leaf_pte.
v1: https://lore.kernel.org/all/20230516141846.792193-1-smostafa@google.com/
---
 arch/arm64/include/asm/sysreg.h    |  1 +
 arch/arm64/kvm/hyp/nvhe/hyp-init.S | 12 ++++++++++++
 arch/arm64/kvm/hyp/pgtable.c       |  7 ++++++-
 3 files changed, 19 insertions(+), 1 deletion(-)

Comments

Oliver Upton May 30, 2023, 6:01 p.m. UTC | #1
On Tue, 30 May 2023 15:08:45 +0000, Mostafa Saleh wrote:
> CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> However, the nvhe code doesn't make use of it as it doesn't map any
> pages with Guarded Page(GP) bit.
> 
> kvm pgtable code is modified to map executable pages with GP bit
> if BTI is enabled for the kernel.
> 
> [...]

Applied to kvmarm/next, thanks!

[1/1] KVM: arm64: Use BTI for nvhe
      https://git.kernel.org/kvmarm/kvmarm/c/b53d4a272349

--
Best,
Oliver
Sudeep Holla July 4, 2023, 1:41 p.m. UTC | #2
On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
> CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> However, the nvhe code doesn't make use of it as it doesn't map any
> pages with Guarded Page(GP) bit.
> 
> kvm pgtable code is modified to map executable pages with GP bit
> if BTI is enabled for the kernel.
> 
> At hyp init, SCTLR_EL2.BT is set to 1 to match EL1 configuration
> (SCTLR_EL1.BT1) set in bti_enable().
> 
> One difference between kernel and nvhe code, is that the kernel maps
> .text with GP while nvhe maps all the executable pages, this makes
> nvhe code need to deal with special initialization code coming from
> other executable sections (.idmap.text).
> For this we need to add bti instruction at the beginning of
> __kvm_handle_stub_hvc as it can be called by  __host_hvc through
> branch instruction(br) and unlike SYM_FUNC_START, SYM_CODE_START
> doesn’t add bti instruction at the beginning, and it can’t be modified
> to add it as it is used with vector tables.
> Another solution which is more intrusive is to convert
> __kvm_handle_stub_hvc to a function and inject “bti jc” instead of
> “bti c” in SYM_FUNC_START
>

I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
cpuidle enabled. The system fails to boot. I just bisected the issue to this
patch and also saw this patch landed in the linus tree yesterday/today.
Not sure if this is something to do with the fact that pKVM skips to
__kvm_handle_stub_hvc in __host_hvc.

Let me know if you want be to try something.
Mostafa Saleh July 4, 2023, 2:18 p.m. UTC | #3
Hi Sudeep,

On Tue, Jul 04, 2023 at 02:41:36PM +0100, Sudeep Holla wrote:
> On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
> > CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> > However, the nvhe code doesn't make use of it as it doesn't map any
> > pages with Guarded Page(GP) bit.
> > 
> > kvm pgtable code is modified to map executable pages with GP bit
> > if BTI is enabled for the kernel.
> > 
> > At hyp init, SCTLR_EL2.BT is set to 1 to match EL1 configuration
> > (SCTLR_EL1.BT1) set in bti_enable().
> > 
> > One difference between kernel and nvhe code, is that the kernel maps
> > .text with GP while nvhe maps all the executable pages, this makes
> > nvhe code need to deal with special initialization code coming from
> > other executable sections (.idmap.text).
> > For this we need to add bti instruction at the beginning of
> > __kvm_handle_stub_hvc as it can be called by  __host_hvc through
> > branch instruction(br) and unlike SYM_FUNC_START, SYM_CODE_START
> > doesn’t add bti instruction at the beginning, and it can’t be modified
> > to add it as it is used with vector tables.
> > Another solution which is more intrusive is to convert
> > __kvm_handle_stub_hvc to a function and inject “bti jc” instead of
> > “bti c” in SYM_FUNC_START
> >
> 
> I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
> cpuidle enabled. The system fails to boot. I just bisected the issue to this
> patch and also saw this patch landed in the linus tree yesterday/today.

One of the challenges of BTI is that we need to add explicit BTI instructions
for assembly code. I checked the code to make sure that nothing was missing,
but maybe this is not the case.
Can you please share more about the issue (is ESR a Branch Target Exception,
call stack...) if possible.
Also, is this with CONFIG_ARM_PSCI_CPUIDLE?

> Not sure if this is something to do with the fact that pKVM skips to
> __kvm_handle_stub_hvc in __host_hvc.
__kvm_handle_stub_hvc is called from __host_hvc with "br x5"
That's why "bti j" was added at the beginning of __kvm_handle_stub_hvc,
so this should be fine.

> Let me know if you want be to try something.
> 
> -- 
> Regards,
> Sudeep

Thanks,
Mostafa
Sudeep Holla July 4, 2023, 2:33 p.m. UTC | #4
Hi Mostafa,

On Tue, Jul 04, 2023 at 02:18:09PM +0000, Mostafa Saleh wrote:
> Hi Sudeep,
> 
> On Tue, Jul 04, 2023 at 02:41:36PM +0100, Sudeep Holla wrote:
> > On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
> > > CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> > > However, the nvhe code doesn't make use of it as it doesn't map any
> > > pages with Guarded Page(GP) bit.
> > > 
> > > kvm pgtable code is modified to map executable pages with GP bit
> > > if BTI is enabled for the kernel.
> > > 
> > > At hyp init, SCTLR_EL2.BT is set to 1 to match EL1 configuration
> > > (SCTLR_EL1.BT1) set in bti_enable().
> > > 
> > > One difference between kernel and nvhe code, is that the kernel maps
> > > .text with GP while nvhe maps all the executable pages, this makes
> > > nvhe code need to deal with special initialization code coming from
> > > other executable sections (.idmap.text).
> > > For this we need to add bti instruction at the beginning of
> > > __kvm_handle_stub_hvc as it can be called by  __host_hvc through
> > > branch instruction(br) and unlike SYM_FUNC_START, SYM_CODE_START
> > > doesn’t add bti instruction at the beginning, and it can’t be modified
> > > to add it as it is used with vector tables.
> > > Another solution which is more intrusive is to convert
> > > __kvm_handle_stub_hvc to a function and inject “bti jc” instead of
> > > “bti c” in SYM_FUNC_START
> > >
> > 
> > I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
> > cpuidle enabled. The system fails to boot. I just bisected the issue to this
> > patch and also saw this patch landed in the linus tree yesterday/today.
> 
> One of the challenges of BTI is that we need to add explicit BTI instructions
> for assembly code. I checked the code to make sure that nothing was missing,
> but maybe this is not the case.
> Can you please share more about the issue (is ESR a Branch Target Exception,
> call stack...) if possible.

I haven't debugged it any further, just reported it as soon as I bisected it.
Reverting this get back the booting system. I am not sure if anything is going
wrong when the CPU is entering suspend(highly unlikely in normal scenario but
I am not so sure with pKVM trapping these PSCI calls now) or when it is woken
up and resuming back. IIUC this now will happen via kvm_hyp_cpu_resume->
__kvm_hyp_init_cpu->___kvm_hyp_init. 

> Also, is this with CONFIG_ARM_PSCI_CPUIDLE?

Yes, basically the cpus can enter cpu_suspend which IIUC pKVM traps and
handle for the host.

> 
> > Not sure if this is something to do with the fact that pKVM skips to
> > __kvm_handle_stub_hvc in __host_hvc.

Sorry, my bad. I meant pKVM skips calling __kvm_handle_stub_hvc in __host_hvc
and jumps to __host_exit directly. Sorry for that, one wrong "to" changed the
whole meaning.

> __kvm_handle_stub_hvc is called from __host_hvc with "br x5"
> That's why "bti j" was added at the beginning of __kvm_handle_stub_hvc,
> so this should be fine.
>

Yes I saw that and understood that but I wanted to tell the above which went
horribly wrong before.
Mostafa Saleh July 4, 2023, 4:27 p.m. UTC | #5
Hi Sudeep,

On Tue, Jul 04, 2023 at 03:33:39PM +0100, Sudeep Holla wrote:
> Hi Mostafa,
> 
> On Tue, Jul 04, 2023 at 02:18:09PM +0000, Mostafa Saleh wrote:
> > Hi Sudeep,
> > 
> > On Tue, Jul 04, 2023 at 02:41:36PM +0100, Sudeep Holla wrote:
> > > On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
> > > > CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> > > > However, the nvhe code doesn't make use of it as it doesn't map any
> > > > pages with Guarded Page(GP) bit.
> > > > 
> > > > kvm pgtable code is modified to map executable pages with GP bit
> > > > if BTI is enabled for the kernel.
> > > > 
> > > > At hyp init, SCTLR_EL2.BT is set to 1 to match EL1 configuration
> > > > (SCTLR_EL1.BT1) set in bti_enable().
> > > > 
> > > > One difference between kernel and nvhe code, is that the kernel maps
> > > > .text with GP while nvhe maps all the executable pages, this makes
> > > > nvhe code need to deal with special initialization code coming from
> > > > other executable sections (.idmap.text).
> > > > For this we need to add bti instruction at the beginning of
> > > > __kvm_handle_stub_hvc as it can be called by  __host_hvc through
> > > > branch instruction(br) and unlike SYM_FUNC_START, SYM_CODE_START
> > > > doesn’t add bti instruction at the beginning, and it can’t be modified
> > > > to add it as it is used with vector tables.
> > > > Another solution which is more intrusive is to convert
> > > > __kvm_handle_stub_hvc to a function and inject “bti jc” instead of
> > > > “bti c” in SYM_FUNC_START
> > > >
> > > 
> > > I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
> > > cpuidle enabled. The system fails to boot. I just bisected the issue to this
> > > patch and also saw this patch landed in the linus tree yesterday/today.
> > 
> > One of the challenges of BTI is that we need to add explicit BTI instructions
> > for assembly code. I checked the code to make sure that nothing was missing,
> > but maybe this is not the case.
> > Can you please share more about the issue (is ESR a Branch Target Exception,
> > call stack...) if possible.
> 
> I haven't debugged it any further, just reported it as soon as I bisected it.
> Reverting this get back the booting system. I am not sure if anything is going
> wrong when the CPU is entering suspend(highly unlikely in normal scenario but
> I am not so sure with pKVM trapping these PSCI calls now) or when it is woken
> up and resuming back. IIUC this now will happen via kvm_hyp_cpu_resume->
> __kvm_hyp_init_cpu->___kvm_hyp_init. 

Thanks a lot for the information.

I checked this now, and I believe I found an issue. I see that __kvm_hyp_init_cpu
calls kvm_host_psci_cpu_entry indirectly and there is no BTI there.
I think this is the only C function that needs special handling.

Can you please check if this solves the issue?

diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
index c87c63133e10..7df63f364c3c 100644
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -297,3 +297,8 @@ SYM_CODE_START(__kvm_hyp_host_forward_smc)
 
        ret
 SYM_CODE_END(__kvm_hyp_host_forward_smc)
+
+SYM_CODE_START(kvm_host_psci_cpu_entry)
+       bti j
+       b __kvm_host_psci_cpu_entry
+SYM_CODE_END(kvm_host_psci_cpu_entry)
diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
index 08508783ec3d..24543d2a3490 100644
--- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c
+++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
@@ -200,7 +200,7 @@ static int psci_system_suspend(u64 func_id, struct kvm_cpu_context *host_ctxt)
                         __hyp_pa(init_params), 0);
 }
 
-asmlinkage void __noreturn kvm_host_psci_cpu_entry(bool is_cpu_on)
+asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bool is_cpu_on)
 {
        struct psci_boot_args *boot_args;
        struct kvm_cpu_context *host_ctxt;


> > Also, is this with CONFIG_ARM_PSCI_CPUIDLE?
> 
> Yes, basically the cpus can enter cpu_suspend which IIUC pKVM traps and
> handle for the host.

My current setup has no CONFIG_ARM_PSCI_CPUIDLE?, I will try to find
something I can test with.

> > 
> > > Not sure if this is something to do with the fact that pKVM skips to
> > > __kvm_handle_stub_hvc in __host_hvc.
> 
> Sorry, my bad. I meant pKVM skips calling __kvm_handle_stub_hvc in __host_hvc
> and jumps to __host_exit directly. Sorry for that, one wrong "to" changed the
> whole meaning.

I don't see an issue in this, as this path has no indirect branches.

> > __kvm_handle_stub_hvc is called from __host_hvc with "br x5"
> > That's why "bti j" was added at the beginning of __kvm_handle_stub_hvc,
> > so this should be fine.
> >
> 
> Yes I saw that and understood that but I wanted to tell the above which went
> horribly wrong before.
> 

Thanks,
Mostafa
Sudeep Holla July 4, 2023, 7:25 p.m. UTC | #6
On Tue, Jul 04, 2023 at 04:27:04PM +0000, Mostafa Saleh wrote:
> Hi Sudeep,
> 
> On Tue, Jul 04, 2023 at 03:33:39PM +0100, Sudeep Holla wrote:
> > Hi Mostafa,
> > 
> > On Tue, Jul 04, 2023 at 02:18:09PM +0000, Mostafa Saleh wrote:
> > > Hi Sudeep,
> > > 
> > > On Tue, Jul 04, 2023 at 02:41:36PM +0100, Sudeep Holla wrote:
> > > > On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
> > > > > CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> > > > > However, the nvhe code doesn't make use of it as it doesn't map any
> > > > > pages with Guarded Page(GP) bit.
> > > > > 
> > > > > kvm pgtable code is modified to map executable pages with GP bit
> > > > > if BTI is enabled for the kernel.
> > > > > 
> > > > > At hyp init, SCTLR_EL2.BT is set to 1 to match EL1 configuration
> > > > > (SCTLR_EL1.BT1) set in bti_enable().
> > > > > 
> > > > > One difference between kernel and nvhe code, is that the kernel maps
> > > > > .text with GP while nvhe maps all the executable pages, this makes
> > > > > nvhe code need to deal with special initialization code coming from
> > > > > other executable sections (.idmap.text).
> > > > > For this we need to add bti instruction at the beginning of
> > > > > __kvm_handle_stub_hvc as it can be called by  __host_hvc through
> > > > > branch instruction(br) and unlike SYM_FUNC_START, SYM_CODE_START
> > > > > doesn’t add bti instruction at the beginning, and it can’t be modified
> > > > > to add it as it is used with vector tables.
> > > > > Another solution which is more intrusive is to convert
> > > > > __kvm_handle_stub_hvc to a function and inject “bti jc” instead of
> > > > > “bti c” in SYM_FUNC_START
> > > > >
> > > > 
> > > > I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
> > > > cpuidle enabled. The system fails to boot. I just bisected the issue to this
> > > > patch and also saw this patch landed in the linus tree yesterday/today.
> > > 
> > > One of the challenges of BTI is that we need to add explicit BTI instructions
> > > for assembly code. I checked the code to make sure that nothing was missing,
> > > but maybe this is not the case.
> > > Can you please share more about the issue (is ESR a Branch Target Exception,
> > > call stack...) if possible.
> > 
> > I haven't debugged it any further, just reported it as soon as I bisected it.
> > Reverting this get back the booting system. I am not sure if anything is going
> > wrong when the CPU is entering suspend(highly unlikely in normal scenario but
> > I am not so sure with pKVM trapping these PSCI calls now) or when it is woken
> > up and resuming back. IIUC this now will happen via kvm_hyp_cpu_resume->
> > __kvm_hyp_init_cpu->___kvm_hyp_init. 
> 
> Thanks a lot for the information.
> 
> I checked this now, and I believe I found an issue. I see that __kvm_hyp_init_cpu
> calls kvm_host_psci_cpu_entry indirectly and there is no BTI there.
> I think this is the only C function that needs special handling.
>

So it is in the wake up path. Thanks for the description, now I understand
the issue and fix better.

> Can you please check if this solves the issue?
>

Yes, the below patch fixed the issue. Feel free to add when you post the
formal patch.

Reported-and-Tested-by: Sudeep Holla <sudeep.holla@arm.com>

> diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
> index c87c63133e10..7df63f364c3c 100644
> --- a/arch/arm64/kvm/hyp/nvhe/host.S
> +++ b/arch/arm64/kvm/hyp/nvhe/host.S
> @@ -297,3 +297,8 @@ SYM_CODE_START(__kvm_hyp_host_forward_smc)
>  
>         ret
>  SYM_CODE_END(__kvm_hyp_host_forward_smc)
> +
> +SYM_CODE_START(kvm_host_psci_cpu_entry)
> +       bti j
> +       b __kvm_host_psci_cpu_entry
> +SYM_CODE_END(kvm_host_psci_cpu_entry)
> diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> index 08508783ec3d..24543d2a3490 100644
> --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> @@ -200,7 +200,7 @@ static int psci_system_suspend(u64 func_id, struct kvm_cpu_context *host_ctxt)
>                          __hyp_pa(init_params), 0);
>  }
>  
> -asmlinkage void __noreturn kvm_host_psci_cpu_entry(bool is_cpu_on)
> +asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bool is_cpu_on)
>  {
>         struct psci_boot_args *boot_args;
>         struct kvm_cpu_context *host_ctxt;
> 
> 
> > > Also, is this with CONFIG_ARM_PSCI_CPUIDLE?
> > 
> > Yes, basically the cpus can enter cpu_suspend which IIUC pKVM traps and
> > handle for the host.
> 
> My current setup has no CONFIG_ARM_PSCI_CPUIDLE?, I will try to find
> something I can test with.
>

No worries, I can help until you find one.


> > > 
> > > > Not sure if this is something to do with the fact that pKVM skips to
> > > > __kvm_handle_stub_hvc in __host_hvc.
> > 
> > Sorry, my bad. I meant pKVM skips calling __kvm_handle_stub_hvc in __host_hvc
> > and jumps to __host_exit directly. Sorry for that, one wrong "to" changed the
> > whole meaning.
> 
> I don't see an issue in this, as this path has no indirect branches.
>

Understood.
Mostafa Saleh July 5, 2023, 3:56 p.m. UTC | #7
Hi Sudeep,

On Tue, Jul 04, 2023 at 08:25:29PM +0100, Sudeep Holla wrote:
> On Tue, Jul 04, 2023 at 04:27:04PM +0000, Mostafa Saleh wrote:
> > Hi Sudeep,
> > 
> > On Tue, Jul 04, 2023 at 03:33:39PM +0100, Sudeep Holla wrote:
> > > Hi Mostafa,
> > > 
> > > On Tue, Jul 04, 2023 at 02:18:09PM +0000, Mostafa Saleh wrote:
> > > > Hi Sudeep,
> > > > 
> > > > On Tue, Jul 04, 2023 at 02:41:36PM +0100, Sudeep Holla wrote:
> > > > > On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
> > > > > > CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> > > > > > However, the nvhe code doesn't make use of it as it doesn't map any
> > > > > > pages with Guarded Page(GP) bit.
> > > > > > 
> > > > > > kvm pgtable code is modified to map executable pages with GP bit
> > > > > > if BTI is enabled for the kernel.
> > > > > > 
> > > > > > At hyp init, SCTLR_EL2.BT is set to 1 to match EL1 configuration
> > > > > > (SCTLR_EL1.BT1) set in bti_enable().
> > > > > > 
> > > > > > One difference between kernel and nvhe code, is that the kernel maps
> > > > > > .text with GP while nvhe maps all the executable pages, this makes
> > > > > > nvhe code need to deal with special initialization code coming from
> > > > > > other executable sections (.idmap.text).
> > > > > > For this we need to add bti instruction at the beginning of
> > > > > > __kvm_handle_stub_hvc as it can be called by  __host_hvc through
> > > > > > branch instruction(br) and unlike SYM_FUNC_START, SYM_CODE_START
> > > > > > doesn’t add bti instruction at the beginning, and it can’t be modified
> > > > > > to add it as it is used with vector tables.
> > > > > > Another solution which is more intrusive is to convert
> > > > > > __kvm_handle_stub_hvc to a function and inject “bti jc” instead of
> > > > > > “bti c” in SYM_FUNC_START
> > > > > >
> > > > > 
> > > > > I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
> > > > > cpuidle enabled. The system fails to boot. I just bisected the issue to this
> > > > > patch and also saw this patch landed in the linus tree yesterday/today.
> > > > 
> > > > One of the challenges of BTI is that we need to add explicit BTI instructions
> > > > for assembly code. I checked the code to make sure that nothing was missing,
> > > > but maybe this is not the case.
> > > > Can you please share more about the issue (is ESR a Branch Target Exception,
> > > > call stack...) if possible.
> > > 
> > > I haven't debugged it any further, just reported it as soon as I bisected it.
> > > Reverting this get back the booting system. I am not sure if anything is going
> > > wrong when the CPU is entering suspend(highly unlikely in normal scenario but
> > > I am not so sure with pKVM trapping these PSCI calls now) or when it is woken
> > > up and resuming back. IIUC this now will happen via kvm_hyp_cpu_resume->
> > > __kvm_hyp_init_cpu->___kvm_hyp_init. 
> > 
> > Thanks a lot for the information.
> > 
> > I checked this now, and I believe I found an issue. I see that __kvm_hyp_init_cpu
> > calls kvm_host_psci_cpu_entry indirectly and there is no BTI there.
> > I think this is the only C function that needs special handling.
> >
> 
> So it is in the wake up path. Thanks for the description, now I understand
> the issue and fix better.
> 
> > Can you please check if this solves the issue?
> >
> 
> Yes, the below patch fixed the issue. Feel free to add when you post the
> formal patch.
> 
> Reported-and-Tested-by: Sudeep Holla <sudeep.holla@arm.com>

Thanks for testing the patch, I will post it on the list.

> > diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
> > index c87c63133e10..7df63f364c3c 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/host.S
> > +++ b/arch/arm64/kvm/hyp/nvhe/host.S
> > @@ -297,3 +297,8 @@ SYM_CODE_START(__kvm_hyp_host_forward_smc)
> >  
> >         ret
> >  SYM_CODE_END(__kvm_hyp_host_forward_smc)
> > +
> > +SYM_CODE_START(kvm_host_psci_cpu_entry)
> > +       bti j
> > +       b __kvm_host_psci_cpu_entry
> > +SYM_CODE_END(kvm_host_psci_cpu_entry)
> > diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> > index 08508783ec3d..24543d2a3490 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> > @@ -200,7 +200,7 @@ static int psci_system_suspend(u64 func_id, struct kvm_cpu_context *host_ctxt)
> >                          __hyp_pa(init_params), 0);
> >  }
> >  
> > -asmlinkage void __noreturn kvm_host_psci_cpu_entry(bool is_cpu_on)
> > +asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bool is_cpu_on)
> >  {
> >         struct psci_boot_args *boot_args;
> >         struct kvm_cpu_context *host_ctxt;
> > 
> > 
> > > > Also, is this with CONFIG_ARM_PSCI_CPUIDLE?
> > > 
> > > Yes, basically the cpus can enter cpu_suspend which IIUC pKVM traps and
> > > handle for the host.
> > 
> > My current setup has no CONFIG_ARM_PSCI_CPUIDLE?, I will try to find
> > something I can test with.
> >
> 
> No worries, I can help until you find one.
> 
> 
> > > > 
> > > > > Not sure if this is something to do with the fact that pKVM skips to
> > > > > __kvm_handle_stub_hvc in __host_hvc.
> > > 
> > > Sorry, my bad. I meant pKVM skips calling __kvm_handle_stub_hvc in __host_hvc
> > > and jumps to __host_exit directly. Sorry for that, one wrong "to" changed the
> > > whole meaning.
> > 
> > I don't see an issue in this, as this path has no indirect branches.
> >
> 
> Understood.

Thanks,
Mostafa
Mostafa Saleh July 6, 2023, 12:49 p.m. UTC | #8
Hi Marc and Oliver,

I was double checking that nothing else was missed.

I found there is another problem for hw that has BTI and is affected
by specterv3a.

"br'' instructions are generated at runtime for the vector
table(__bp_harden_hyp_vecs).
These branches would land on vectors in __kvm_hyp_vector at offset 8.

As all the macros are defined with valid_vect/invalid_vect, it is
sufficient to add "bti j"
there at the correct offset.

I am not sure if such hardware exists. I tested this with a stubbed
"has_spectre_v3a" which
confirms the issue and the fix.

Please let me know if this fix suitable, I can include it with the other fix in
"[PATCH] KVM: arm64: Add missing BTI instruction in kvm_host_psci_cpu_entry"

diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index 8f3f93fa119e..175c030379e3 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -154,6 +154,12 @@ SYM_CODE_END(\label)
  esb
  stp x0, x1, [sp, #-16]!
 662:
+ /*
+ * Specter vectors __bp_harden_hyp_vecs generate br instructions at runtime
+ * that jump at offset 8 at __kvm_hyp_vector.
+ * As hyp .text is guarded section, it needs bti j.
+ */
+ bti j
  b \target

 check_preamble_length 661b, 662b
@@ -165,6 +171,8 @@ check_preamble_length 661b, 662b
  nop
  stp x0, x1, [sp, #-16]!
 662:
+ /* Check valid_vect */
+ bti j
  b \target

 check_preamble_length 661b, 662b

--
Thanks,
Mostafa




On Wed, Jul 5, 2023 at 4:56 PM Mostafa Saleh <smostafa@google.com> wrote:
>
> Hi Sudeep,
>
> On Tue, Jul 04, 2023 at 08:25:29PM +0100, Sudeep Holla wrote:
> > On Tue, Jul 04, 2023 at 04:27:04PM +0000, Mostafa Saleh wrote:
> > > Hi Sudeep,
> > >
> > > On Tue, Jul 04, 2023 at 03:33:39PM +0100, Sudeep Holla wrote:
> > > > Hi Mostafa,
> > > >
> > > > On Tue, Jul 04, 2023 at 02:18:09PM +0000, Mostafa Saleh wrote:
> > > > > Hi Sudeep,
> > > > >
> > > > > On Tue, Jul 04, 2023 at 02:41:36PM +0100, Sudeep Holla wrote:
> > > > > > On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
> > > > > > > CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> > > > > > > However, the nvhe code doesn't make use of it as it doesn't map any
> > > > > > > pages with Guarded Page(GP) bit.
> > > > > > >
> > > > > > > kvm pgtable code is modified to map executable pages with GP bit
> > > > > > > if BTI is enabled for the kernel.
> > > > > > >
> > > > > > > At hyp init, SCTLR_EL2.BT is set to 1 to match EL1 configuration
> > > > > > > (SCTLR_EL1.BT1) set in bti_enable().
> > > > > > >
> > > > > > > One difference between kernel and nvhe code, is that the kernel maps
> > > > > > > .text with GP while nvhe maps all the executable pages, this makes
> > > > > > > nvhe code need to deal with special initialization code coming from
> > > > > > > other executable sections (.idmap.text).
> > > > > > > For this we need to add bti instruction at the beginning of
> > > > > > > __kvm_handle_stub_hvc as it can be called by  __host_hvc through
> > > > > > > branch instruction(br) and unlike SYM_FUNC_START, SYM_CODE_START
> > > > > > > doesn’t add bti instruction at the beginning, and it can’t be modified
> > > > > > > to add it as it is used with vector tables.
> > > > > > > Another solution which is more intrusive is to convert
> > > > > > > __kvm_handle_stub_hvc to a function and inject “bti jc” instead of
> > > > > > > “bti c” in SYM_FUNC_START
> > > > > > >
> > > > > >
> > > > > > I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
> > > > > > cpuidle enabled. The system fails to boot. I just bisected the issue to this
> > > > > > patch and also saw this patch landed in the linus tree yesterday/today.
> > > > >
> > > > > One of the challenges of BTI is that we need to add explicit BTI instructions
> > > > > for assembly code. I checked the code to make sure that nothing was missing,
> > > > > but maybe this is not the case.
> > > > > Can you please share more about the issue (is ESR a Branch Target Exception,
> > > > > call stack...) if possible.
> > > >
> > > > I haven't debugged it any further, just reported it as soon as I bisected it.
> > > > Reverting this get back the booting system. I am not sure if anything is going
> > > > wrong when the CPU is entering suspend(highly unlikely in normal scenario but
> > > > I am not so sure with pKVM trapping these PSCI calls now) or when it is woken
> > > > up and resuming back. IIUC this now will happen via kvm_hyp_cpu_resume->
> > > > __kvm_hyp_init_cpu->___kvm_hyp_init.
> > >
> > > Thanks a lot for the information.
> > >
> > > I checked this now, and I believe I found an issue. I see that __kvm_hyp_init_cpu
> > > calls kvm_host_psci_cpu_entry indirectly and there is no BTI there.
> > > I think this is the only C function that needs special handling.
> > >
> >
> > So it is in the wake up path. Thanks for the description, now I understand
> > the issue and fix better.
> >
> > > Can you please check if this solves the issue?
> > >
> >
> > Yes, the below patch fixed the issue. Feel free to add when you post the
> > formal patch.
> >
> > Reported-and-Tested-by: Sudeep Holla <sudeep.holla@arm.com>
>
> Thanks for testing the patch, I will post it on the list.
>
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
> > > index c87c63133e10..7df63f364c3c 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/host.S
> > > +++ b/arch/arm64/kvm/hyp/nvhe/host.S
> > > @@ -297,3 +297,8 @@ SYM_CODE_START(__kvm_hyp_host_forward_smc)
> > >
> > >         ret
> > >  SYM_CODE_END(__kvm_hyp_host_forward_smc)
> > > +
> > > +SYM_CODE_START(kvm_host_psci_cpu_entry)
> > > +       bti j
> > > +       b __kvm_host_psci_cpu_entry
> > > +SYM_CODE_END(kvm_host_psci_cpu_entry)
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/psci-relay.c b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> > > index 08508783ec3d..24543d2a3490 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/psci-relay.c
> > > @@ -200,7 +200,7 @@ static int psci_system_suspend(u64 func_id, struct kvm_cpu_context *host_ctxt)
> > >                          __hyp_pa(init_params), 0);
> > >  }
> > >
> > > -asmlinkage void __noreturn kvm_host_psci_cpu_entry(bool is_cpu_on)
> > > +asmlinkage void __noreturn __kvm_host_psci_cpu_entry(bool is_cpu_on)
> > >  {
> > >         struct psci_boot_args *boot_args;
> > >         struct kvm_cpu_context *host_ctxt;
> > >
> > >
> > > > > Also, is this with CONFIG_ARM_PSCI_CPUIDLE?
> > > >
> > > > Yes, basically the cpus can enter cpu_suspend which IIUC pKVM traps and
> > > > handle for the host.
> > >
> > > My current setup has no CONFIG_ARM_PSCI_CPUIDLE?, I will try to find
> > > something I can test with.
> > >
> >
> > No worries, I can help until you find one.
> >
> >
> > > > >
> > > > > > Not sure if this is something to do with the fact that pKVM skips to
> > > > > > __kvm_handle_stub_hvc in __host_hvc.
> > > >
> > > > Sorry, my bad. I meant pKVM skips calling __kvm_handle_stub_hvc in __host_hvc
> > > > and jumps to __host_exit directly. Sorry for that, one wrong "to" changed the
> > > > whole meaning.
> > >
> > > I don't see an issue in this, as this path has no indirect branches.
> > >
> >
> > Understood.
>
> Thanks,
> Mostafa
>
Marc Zyngier July 6, 2023, 2:27 p.m. UTC | #9
Hi Mostafa,

On Thu, 06 Jul 2023 13:49:04 +0100,
Mostafa Saleh <smostafa@google.com> wrote:
> 
> Hi Marc and Oliver,
> 
> I was double checking that nothing else was missed.
> 
> I found there is another problem for hw that has BTI and is affected
> by specterv3a.
> 
> "br'' instructions are generated at runtime for the vector
> table(__bp_harden_hyp_vecs).  These branches would land on vectors
> in __kvm_hyp_vector at offset 8.
> 
> As all the macros are defined with valid_vect/invalid_vect, it is
> sufficient to add "bti j" there at the correct offset.
>
> I am not sure if such hardware exists. I tested this with a stubbed
> "has_spectre_v3a" which confirms the issue and the fix.

Thanks for the heads up.

Fortunately, there is no such HW as far as I can tell. Only Cortex-A57
and A72 are affected by this (and the only two CPUs for which we
engage the mitigation), and they are way too old to know about BTI.

> Please let me know if this fix suitable, I can include it with the other fix in
> "[PATCH] KVM: arm64: Add missing BTI instruction in kvm_host_psci_cpu_entry"
> 
> diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
> index 8f3f93fa119e..175c030379e3 100644
> --- a/arch/arm64/kvm/hyp/hyp-entry.S
> +++ b/arch/arm64/kvm/hyp/hyp-entry.S
> @@ -154,6 +154,12 @@ SYM_CODE_END(\label)
>   esb
>   stp x0, x1, [sp, #-16]!
>  662:
> + /*
> + * Specter vectors __bp_harden_hyp_vecs generate br instructions at runtime
> + * that jump at offset 8 at __kvm_hyp_vector.
> + * As hyp .text is guarded section, it needs bti j.
> + */
> + bti j
>   b \target
> 
>  check_preamble_length 661b, 662b
> @@ -165,6 +171,8 @@ check_preamble_length 661b, 662b
>   nop
>   stp x0, x1, [sp, #-16]!
>  662:
> + /* Check valid_vect */
> + bti j
>   b \target
> 
>  check_preamble_length 661b, 662b

This looks correct to me.

If you can respin you initial patch (with maybe a slightly more
generic subject) so that Oliver can pick it up as part of the next
batch of fixes, that'd be great.

Thanks,

	M.
Mostafa Saleh July 6, 2023, 3:23 p.m. UTC | #10
On Thu, Jul 06, 2023 at 03:27:36PM +0100, Marc Zyngier wrote:
> Hi Mostafa,
> 
> On Thu, 06 Jul 2023 13:49:04 +0100,
> Mostafa Saleh <smostafa@google.com> wrote:
> > 
> > Hi Marc and Oliver,
> > 
> > I was double checking that nothing else was missed.
> > 
> > I found there is another problem for hw that has BTI and is affected
> > by specterv3a.
> > 
> > "br'' instructions are generated at runtime for the vector
> > table(__bp_harden_hyp_vecs).  These branches would land on vectors
> > in __kvm_hyp_vector at offset 8.
> > 
> > As all the macros are defined with valid_vect/invalid_vect, it is
> > sufficient to add "bti j" there at the correct offset.
> >
> > I am not sure if such hardware exists. I tested this with a stubbed
> > "has_spectre_v3a" which confirms the issue and the fix.
> 
> Thanks for the heads up.
> 
> Fortunately, there is no such HW as far as I can tell. Only Cortex-A57
> and A72 are affected by this (and the only two CPUs for which we
> engage the mitigation), and they are way too old to know about BTI.
> 
> > Please let me know if this fix suitable, I can include it with the other fix in
> > "[PATCH] KVM: arm64: Add missing BTI instruction in kvm_host_psci_cpu_entry"
> > 
> > diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
> > index 8f3f93fa119e..175c030379e3 100644
> > --- a/arch/arm64/kvm/hyp/hyp-entry.S
> > +++ b/arch/arm64/kvm/hyp/hyp-entry.S
> > @@ -154,6 +154,12 @@ SYM_CODE_END(\label)
> >   esb
> >   stp x0, x1, [sp, #-16]!
> >  662:
> > + /*
> > + * Specter vectors __bp_harden_hyp_vecs generate br instructions at runtime
> > + * that jump at offset 8 at __kvm_hyp_vector.
> > + * As hyp .text is guarded section, it needs bti j.
> > + */
> > + bti j
> >   b \target
> > 
> >  check_preamble_length 661b, 662b
> > @@ -165,6 +171,8 @@ check_preamble_length 661b, 662b
> >   nop
> >   stp x0, x1, [sp, #-16]!
> >  662:
> > + /* Check valid_vect */
> > + bti j
> >   b \target
> > 
> >  check_preamble_length 661b, 662b
> 
> This looks correct to me.
> 
> If you can respin you initial patch (with maybe a slightly more
> generic subject) so that Oliver can pick it up as part of the next
> batch of fixes, that'd be great.

Thanks a lot, I just sent v2.
[PATCH v2] KVM: arm64: Add missing BTI instructions

> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.
Thorsten Leemhuis July 12, 2023, 10:34 a.m. UTC | #11
[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 04.07.23 15:41, Sudeep Holla wrote:
> On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
>> CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
>> However, the nvhe code doesn't make use of it as it doesn't map any
>> pages with Guarded Page(GP) bit.
> [...]
> I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
> cpuidle enabled. The system fails to boot. I just bisected the issue to this
> patch and also saw this patch landed in the linus tree yesterday/today.
> Not sure if this is something to do with the fact that pKVM skips to
> __kvm_handle_stub_hvc in __host_hvc.
> 
> Let me know if you want be to try something.

Thanks for the report. Seems the fix is slow to progress. Hence to be
sure the issue doesn't fall through the cracks unnoticed, I'm adding it
to regzbot, the Linux kernel regression tracking bot:

#regzbot ^introduced b53d4a272349
#regzbot title KVM: arm64: boot failures due to missing bti instructions
#regzbot monitor:
https://lore.kernel.org/all/20230706152240.685684-1-smostafa@google.com/
#regzbot fix: KVM: arm64: Add missing BTI instructions
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
Marc Zyngier July 12, 2023, 10:44 a.m. UTC | #12
On Wed, 12 Jul 2023 11:34:51 +0100,
"Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> wrote:
> 
> [CCing the regression list, as it should be in the loop for regressions:
> https://docs.kernel.org/admin-guide/reporting-regressions.html]
> 
> [TLDR: I'm adding this report to the list of tracked Linux kernel
> regressions; the text you find below is based on a few templates
> paragraphs you might have encountered already in similar form.
> See link in footer if these mails annoy you.]
> 
> On 04.07.23 15:41, Sudeep Holla wrote:
> > On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
> >> CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> >> However, the nvhe code doesn't make use of it as it doesn't map any
> >> pages with Guarded Page(GP) bit.
> > [...]
> > I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
> > cpuidle enabled. The system fails to boot. I just bisected the issue to this
> > patch and also saw this patch landed in the linus tree yesterday/today.
> > Not sure if this is something to do with the fact that pKVM skips to
> > __kvm_handle_stub_hvc in __host_hvc.
> > 
> > Let me know if you want be to try something.
> 
> Thanks for the report. Seems the fix is slow to progress.

It's not. See [1].

	M.

[1] https://lore.kernel.org/r/20230706152240.685684-1-smostafa@google.com
Thorsten Leemhuis July 12, 2023, 10:52 a.m. UTC | #13
On 12.07.23 12:44, Marc Zyngier wrote:
> On Wed, 12 Jul 2023 11:34:51 +0100,
> "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> wrote:
>>
>> [CCing the regression list, as it should be in the loop for regressions:
>> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>>
>> [TLDR: I'm adding this report to the list of tracked Linux kernel
>> regressions; the text you find below is based on a few templates
>> paragraphs you might have encountered already in similar form.
>> See link in footer if these mails annoy you.]
>>
>> On 04.07.23 15:41, Sudeep Holla wrote:
>>> On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
>>>> CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
>>>> However, the nvhe code doesn't make use of it as it doesn't map any
>>>> pages with Guarded Page(GP) bit.
>>> [...]
>>> I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
>>> cpuidle enabled. The system fails to boot. I just bisected the issue to this
>>> patch and also saw this patch landed in the linus tree yesterday/today.
>>> Not sure if this is something to do with the fact that pKVM skips to
>>> __kvm_handle_stub_hvc in __host_hvc.
>>>
>>> Let me know if you want be to try something.
>>
>> Thanks for the report. Seems the fix is slow to progress.
> 
> It's not. See [1].
> 
> [1] https://lore.kernel.org/r/20230706152240.685684-1-smostafa@google.com

I'm aware of that fix, as one of the regzbot commands in the mail your
quoted pointed to that mail. But unless I'm missing something that fix
is now nearly a week old and not yet in -next. That from my point of
view makes it "slow to progress" and trackworthy.

Ciao, Thorsten
Marc Zyngier July 12, 2023, 11:01 a.m. UTC | #14
On Wed, 12 Jul 2023 11:52:39 +0100,
"Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> wrote:
> 
> On 12.07.23 12:44, Marc Zyngier wrote:
> > On Wed, 12 Jul 2023 11:34:51 +0100,
> > "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> wrote:
> >>
> >> [CCing the regression list, as it should be in the loop for regressions:
> >> https://docs.kernel.org/admin-guide/reporting-regressions.html]
> >>
> >> [TLDR: I'm adding this report to the list of tracked Linux kernel
> >> regressions; the text you find below is based on a few templates
> >> paragraphs you might have encountered already in similar form.
> >> See link in footer if these mails annoy you.]
> >>
> >> On 04.07.23 15:41, Sudeep Holla wrote:
> >>> On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
> >>>> CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
> >>>> However, the nvhe code doesn't make use of it as it doesn't map any
> >>>> pages with Guarded Page(GP) bit.
> >>> [...]
> >>> I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
> >>> cpuidle enabled. The system fails to boot. I just bisected the issue to this
> >>> patch and also saw this patch landed in the linus tree yesterday/today.
> >>> Not sure if this is something to do with the fact that pKVM skips to
> >>> __kvm_handle_stub_hvc in __host_hvc.
> >>>
> >>> Let me know if you want be to try something.
> >>
> >> Thanks for the report. Seems the fix is slow to progress.
> > 
> > It's not. See [1].
> > 
> > [1] https://lore.kernel.org/r/20230706152240.685684-1-smostafa@google.com
> 
> I'm aware of that fix, as one of the regzbot commands in the mail your
> quoted pointed to that mail. But unless I'm missing something that fix
> is now nearly a week old and not yet in -next. That from my point of
> view makes it "slow to progress" and trackworthy.

Shoving stuff in -next early is not a guarantee of the fix being
correct. Oddly enough, some of us value taking the time it takes to
make sure the fix is correct and addresses the *full* issue, not only
the reported corner case.

I know, this is not trendy. Too bad. On top of that, we don't push
fixes to -next either, so good luck tracking that.

	M.
Thorsten Leemhuis July 12, 2023, 11:16 a.m. UTC | #15
On 12.07.23 13:01, Marc Zyngier wrote:
> On Wed, 12 Jul 2023 11:52:39 +0100,
> "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> wrote:
>>
>> On 12.07.23 12:44, Marc Zyngier wrote:
>>> On Wed, 12 Jul 2023 11:34:51 +0100,
>>> "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> wrote:
>>>>
>>>> [CCing the regression list, as it should be in the loop for regressions:
>>>> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>>>>
>>>> [TLDR: I'm adding this report to the list of tracked Linux kernel
>>>> regressions; the text you find below is based on a few templates
>>>> paragraphs you might have encountered already in similar form.
>>>> See link in footer if these mails annoy you.]
>>>>
>>>> On 04.07.23 15:41, Sudeep Holla wrote:
>>>>> On Tue, May 30, 2023 at 03:08:45PM +0000, Mostafa Saleh wrote:
>>>>>> CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI.
>>>>>> However, the nvhe code doesn't make use of it as it doesn't map any
>>>>>> pages with Guarded Page(GP) bit.
>>>>> [...]
>>>>> I was chasing a bug in linux-next yesterday with protected nVHE(pKVM) and
>>>>> cpuidle enabled. The system fails to boot. I just bisected the issue to this
>>>>> patch and also saw this patch landed in the linus tree yesterday/today.
>>>>> Not sure if this is something to do with the fact that pKVM skips to
>>>>> __kvm_handle_stub_hvc in __host_hvc.
>>>>>
>>>>> Let me know if you want be to try something.
>>>>
>>>> Thanks for the report. Seems the fix is slow to progress.
>>>
>>> It's not. See [1].
>>>
>>> [1] https://lore.kernel.org/r/20230706152240.685684-1-smostafa@google.com
>>
>> I'm aware of that fix, as one of the regzbot commands in the mail your
>> quoted pointed to that mail. But unless I'm missing something that fix
>> is now nearly a week old and not yet in -next. That from my point of
>> view makes it "slow to progress" and trackworthy.
> 
> Shoving stuff in -next early is not a guarantee of the fix being
> correct. Oddly enough, some of us value taking the time it takes to
> make sure the fix is correct and addresses the *full* issue, not only
> the reported corner case.

All that is totally fine with me. Sorry, we got off on the wrong foot,
because I didn't choose my words carefully. Apologies. All I wanted to
express was: "I saw a bug report and a fix for it; I wouldn't have added
this to the tracking if that fix was already heading towards mainline".

Ciao, Thorsten
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index e72d9aaab6b1..204124ce86c4 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -558,6 +558,7 @@ 
 			 (BIT(18)) | (BIT(22)) | (BIT(23)) | (BIT(28)) | \
 			 (BIT(29)))
 
+#define SCTLR_EL2_BT	(BIT(36))
 #ifdef CONFIG_CPU_BIG_ENDIAN
 #define ENDIAN_SET_EL2		SCTLR_ELx_EE
 #else
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index a6d67c2bb5ae..f3ee66aa2f9d 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -128,6 +128,13 @@  alternative_if ARM64_HAS_ADDRESS_AUTH
 		     SCTLR_ELx_ENDA | SCTLR_ELx_ENDB)
 	orr	x0, x0, x1
 alternative_else_nop_endif
+
+#ifdef CONFIG_ARM64_BTI_KERNEL
+alternative_if ARM64_BTI
+	orr	x0, x0, #SCTLR_EL2_BT
+alternative_else_nop_endif
+#endif /* CONFIG_ARM64_BTI_KERNEL */
+
 	msr	sctlr_el2, x0
 	isb
 
@@ -196,6 +203,11 @@  SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu)
 SYM_CODE_END(__kvm_hyp_init_cpu)
 
 SYM_CODE_START(__kvm_handle_stub_hvc)
+	/*
+	 * __kvm_handle_stub_hvc called from __host_hvc through branch instruction(br) so
+	 * we need bti j at beginning.
+	 */
+	bti j
 	cmp	x0, #HVC_SOFT_RESTART
 	b.ne	1f
 
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 3d61bd3e591d..a79a45fc4047 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -34,7 +34,7 @@ 
 #define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS	3
 #define KVM_PTE_LEAF_ATTR_LO_S2_AF	BIT(10)
 
-#define KVM_PTE_LEAF_ATTR_HI		GENMASK(63, 51)
+#define KVM_PTE_LEAF_ATTR_HI		GENMASK(63, 50)
 
 #define KVM_PTE_LEAF_ATTR_HI_SW		GENMASK(58, 55)
 
@@ -42,6 +42,8 @@ 
 
 #define KVM_PTE_LEAF_ATTR_HI_S2_XN	BIT(54)
 
+#define KVM_PTE_LEAF_ATTR_HI_S1_GP	BIT(50)
+
 #define KVM_PTE_LEAF_ATTR_S2_PERMS	(KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R | \
 					 KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W | \
 					 KVM_PTE_LEAF_ATTR_HI_S2_XN)
@@ -371,6 +373,9 @@  static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
 
 		if (device)
 			return -EINVAL;
+
+		if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) && system_supports_bti())
+			attr |= KVM_PTE_LEAF_ATTR_HI_S1_GP;
 	} else {
 		attr |= KVM_PTE_LEAF_ATTR_HI_S1_XN;
 	}