diff mbox series

[1/3] arm64: clean up trampoline vector loads

Message ID 20200319091407.51449-1-remi@remlab.net (mailing list archive)
State New, archived
Headers show
Series clean up KPTI / SDEI trampoline data alignment | expand

Commit Message

Rémi Denis-Courmont March 19, 2020, 9:14 a.m. UTC
From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>

This switches from custom instruction patterns to the regular large
memory model sequence with ADRP and LDR. In doing so, the ADD
instruction can be eliminated in the SDEI handler, and the code no
longer assumes that the trampoline vectors and the vectors address both
start on a page boundary.

Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
---
 arch/arm64/kernel/entry.S | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Mark Rutland March 23, 2020, 12:07 p.m. UTC | #1
On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> 
> This switches from custom instruction patterns to the regular large
> memory model sequence with ADRP and LDR. In doing so, the ADD
> instruction can be eliminated in the SDEI handler, and the code no
> longer assumes that the trampoline vectors and the vectors address both
> start on a page boundary.
> 
> Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> ---
>  arch/arm64/kernel/entry.S | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index e5d4e30ee242..24f828739696 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -805,9 +805,9 @@ alternative_else_nop_endif
>  2:
>  	tramp_map_kernel	x30
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adr	x30, tramp_vectors + PAGE_SIZE
> +	adrp	x30, tramp_vectors + PAGE_SIZE
>  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> -	ldr	x30, [x30]
> +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]

I think this is busted for !4K kernels once we reduce the alignment of
__entry_tramp_data_start.

The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
I think that's what's causing the hang Catalin sees with 64K pages (and
would also be a problem for 16K pages).

Ideally, we'd account for those bits with the ADRP, but I'm not sure
that an ELF relocation can encode symbol + addr + symbol:15-12, so we
likely nned more instructions to explicitly mask that in.

... either that, or leave this page aligned.

>  #else
>  	ldr	x30, =vectors
>  #endif
> @@ -953,9 +953,8 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
>  1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
>  
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adr	x4, tramp_vectors + PAGE_SIZE
> -	add	x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
> -	ldr	x4, [x4]
> +	adrp	x4, tramp_vectors + PAGE_SIZE
> +	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]

Likewise here.

Thanks,
Mark.

>  #else
>  	ldr	x4, =__sdei_asm_handler
>  #endif
> -- 
> 2.26.0.rc2
>
Rémi Denis-Courmont March 23, 2020, 12:08 p.m. UTC | #2
Le maanantaina 23. maaliskuuta 2020, 14.07.00 EET Mark Rutland a écrit :
> On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > 
> > This switches from custom instruction patterns to the regular large
> > memory model sequence with ADRP and LDR. In doing so, the ADD
> > instruction can be eliminated in the SDEI handler, and the code no
> > longer assumes that the trampoline vectors and the vectors address both
> > start on a page boundary.
> > 
> > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > ---
> > 
> >  arch/arm64/kernel/entry.S | 9 ++++-----
> >  1 file changed, 4 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index e5d4e30ee242..24f828739696 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > 
> >  2:
> >  	tramp_map_kernel	x30
> >  
> >  #ifdef CONFIG_RANDOMIZE_BASE
> > 
> > -	adr	x30, tramp_vectors + PAGE_SIZE
> > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > 
> >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > 
> > -	ldr	x30, [x30]
> > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> 
> I think this is busted for !4K kernels once we reduce the alignment of
> __entry_tramp_data_start.
> 
> The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
> relocation gives us bits 11:0, so we haven't accounted for bits 15:12.

IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.

I rather suspect that the problem is with my C code diff assuming that 
PAGE_MASK is 4095.
Mark Rutland March 23, 2020, 12:14 p.m. UTC | #3
On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> Le maanantaina 23. maaliskuuta 2020, 14.07.00 EET Mark Rutland a écrit :
> > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > 
> > > This switches from custom instruction patterns to the regular large
> > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > instruction can be eliminated in the SDEI handler, and the code no
> > > longer assumes that the trampoline vectors and the vectors address both
> > > start on a page boundary.
> > > 
> > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > ---
> > > 
> > >  arch/arm64/kernel/entry.S | 9 ++++-----
> > >  1 file changed, 4 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > index e5d4e30ee242..24f828739696 100644
> > > --- a/arch/arm64/kernel/entry.S
> > > +++ b/arch/arm64/kernel/entry.S
> > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > 
> > >  2:
> > >  	tramp_map_kernel	x30
> > >  
> > >  #ifdef CONFIG_RANDOMIZE_BASE
> > > 
> > > -	adr	x30, tramp_vectors + PAGE_SIZE
> > > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > > 
> > >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > 
> > > -	ldr	x30, [x30]
> > > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> > 
> > I think this is busted for !4K kernels once we reduce the alignment of
> > __entry_tramp_data_start.
> > 
> > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
> > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> 
> IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.

Sorry, I had erroneously assumed tramp_vectors was page aligned. The
issue still stands -- we haven't accounted for bits 15:12, as those can
differ between tramp_vectors and __entry_tramp_data_start.

Thanks,
Mark.
Catalin Marinas March 23, 2020, 7:04 p.m. UTC | #4
On Mon, Mar 23, 2020 at 12:14:37PM +0000, Mark Rutland wrote:
> On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> > Le maanantaina 23. maaliskuuta 2020, 14.07.00 EET Mark Rutland a écrit :
> > > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > 
> > > > This switches from custom instruction patterns to the regular large
> > > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > > instruction can be eliminated in the SDEI handler, and the code no
> > > > longer assumes that the trampoline vectors and the vectors address both
> > > > start on a page boundary.
> > > > 
> > > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > ---
> > > > 
> > > >  arch/arm64/kernel/entry.S | 9 ++++-----
> > > >  1 file changed, 4 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > index e5d4e30ee242..24f828739696 100644
> > > > --- a/arch/arm64/kernel/entry.S
> > > > +++ b/arch/arm64/kernel/entry.S
> > > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > > 
> > > >  2:
> > > >  	tramp_map_kernel	x30
> > > >  
> > > >  #ifdef CONFIG_RANDOMIZE_BASE
> > > > 
> > > > -	adr	x30, tramp_vectors + PAGE_SIZE
> > > > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > > > 
> > > >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > > 
> > > > -	ldr	x30, [x30]
> > > > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> > > 
> > > I think this is busted for !4K kernels once we reduce the alignment of
> > > __entry_tramp_data_start.
> > > 
> > > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The lo12
> > > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> > 
> > IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.
> 
> Sorry, I had erroneously assumed tramp_vectors was page aligned. The
> issue still stands -- we haven't accounted for bits 15:12, as those can
> differ between tramp_vectors and __entry_tramp_data_start.

Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
below doesn't solve the issue I'm seeing (only reverting patch 3).

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index ca1340eb46d8..4cc9d1df3985 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -810,7 +810,7 @@ alternative_else_nop_endif
 2:
 	tramp_map_kernel	x30
 #ifdef CONFIG_RANDOMIZE_BASE
-	adrp	x30, tramp_vectors + PAGE_SIZE
+	adrp	x30, __entry_tramp_data_start
 alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
 	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
 #else
@@ -964,7 +964,7 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
 1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
 
 #ifdef CONFIG_RANDOMIZE_BASE
-	adrp	x4, tramp_vectors + PAGE_SIZE
+	adrp	x4, __sdei_asm_trampoline_next_handler
 	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
 #else
 	ldr	x4, =__sdei_asm_handler
Rémi Denis-Courmont March 23, 2020, 8:42 p.m. UTC | #5
Le maanantaina 23. maaliskuuta 2020, 21.04.09 EET Catalin Marinas a écrit :
> On Mon, Mar 23, 2020 at 12:14:37PM +0000, Mark Rutland wrote:
> > On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> > > Le maanantaina 23. maaliskuuta 2020, 14.07.00 EET Mark Rutland a écrit :
> > > > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > > 
> > > > > This switches from custom instruction patterns to the regular large
> > > > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > > > instruction can be eliminated in the SDEI handler, and the code no
> > > > > longer assumes that the trampoline vectors and the vectors address
> > > > > both
> > > > > start on a page boundary.
> > > > > 
> > > > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > > ---
> > > > > 
> > > > >  arch/arm64/kernel/entry.S | 9 ++++-----
> > > > >  1 file changed, 4 insertions(+), 5 deletions(-)
> > > > > 
> > > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > > index e5d4e30ee242..24f828739696 100644
> > > > > --- a/arch/arm64/kernel/entry.S
> > > > > +++ b/arch/arm64/kernel/entry.S
> > > > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > > > 
> > > > >  2:
> > > > >  	tramp_map_kernel	x30
> > > > >  
> > > > >  #ifdef CONFIG_RANDOMIZE_BASE
> > > > > 
> > > > > -	adr	x30, tramp_vectors + PAGE_SIZE
> > > > > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > > > > 
> > > > >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > > > 
> > > > > -	ldr	x30, [x30]
> > > > > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> > > > 
> > > > I think this is busted for !4K kernels once we reduce the alignment of
> > > > __entry_tramp_data_start.
> > > > 
> > > > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The
> > > > lo12
> > > > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> > > 
> > > IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.
> > 
> > Sorry, I had erroneously assumed tramp_vectors was page aligned. The
> > issue still stands -- we haven't accounted for bits 15:12, as those can
> > differ between tramp_vectors and __entry_tramp_data_start.

Does that mean that the SDEI code never worked with page size > 4 KiB?

> Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> below doesn't solve the issue I'm seeing (only reverting patch 3).

AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset 
in the main vmlinux does not match the architected offset inside the fixmap. If 
so, then using the symbol directly will not work at all.




> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index ca1340eb46d8..4cc9d1df3985 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -810,7 +810,7 @@ alternative_else_nop_endif
>  2:
>  	tramp_map_kernel	x30
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adrp	x30, tramp_vectors + PAGE_SIZE
> +	adrp	x30, __entry_tramp_data_start
>  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
>  	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
>  #else
> @@ -964,7 +964,7 @@ SYM_CODE_START(__sdei_asm_entry_trampoline)
>  1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
> 
>  #ifdef CONFIG_RANDOMIZE_BASE
> -	adrp	x4, tramp_vectors + PAGE_SIZE
> +	adrp	x4, __sdei_asm_trampoline_next_handler
>  	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
>  #else
>  	ldr	x4, =__sdei_asm_handler
Catalin Marinas March 24, 2020, 10:37 a.m. UTC | #6
On Mon, Mar 23, 2020 at 10:42:30PM +0200, Rémi Denis-Courmont wrote:
> Le maanantaina 23. maaliskuuta 2020, 21.04.09 EET Catalin Marinas a écrit :
> > Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> > below doesn't solve the issue I'm seeing (only reverting patch 3).
> 
> AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset 
> in the main vmlinux does not match the architected offset inside the fixmap. If 
> so, then using the symbol directly will not work at all.

You are right, it broke the defconfig as well.
Mark Rutland March 24, 2020, 10:52 a.m. UTC | #7
On Mon, Mar 23, 2020 at 10:42:30PM +0200, Rémi Denis-Courmont wrote:
> Le maanantaina 23. maaliskuuta 2020, 21.04.09 EET Catalin Marinas a écrit :
> > On Mon, Mar 23, 2020 at 12:14:37PM +0000, Mark Rutland wrote:
> > > On Mon, Mar 23, 2020 at 02:08:53PM +0200, Rémi Denis-Courmont wrote:
> > > > Le maanantaina 23. maaliskuuta 2020, 14.07.00 EET Mark Rutland a écrit :
> > > > > On Thu, Mar 19, 2020 at 11:14:05AM +0200, Rémi Denis-Courmont wrote:
> > > > > > From: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > > > 
> > > > > > This switches from custom instruction patterns to the regular large
> > > > > > memory model sequence with ADRP and LDR. In doing so, the ADD
> > > > > > instruction can be eliminated in the SDEI handler, and the code no
> > > > > > longer assumes that the trampoline vectors and the vectors address
> > > > > > both
> > > > > > start on a page boundary.
> > > > > > 
> > > > > > Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com>
> > > > > > ---
> > > > > > 
> > > > > >  arch/arm64/kernel/entry.S | 9 ++++-----
> > > > > >  1 file changed, 4 insertions(+), 5 deletions(-)
> > > > > > 
> > > > > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > > > > index e5d4e30ee242..24f828739696 100644
> > > > > > --- a/arch/arm64/kernel/entry.S
> > > > > > +++ b/arch/arm64/kernel/entry.S
> > > > > > @@ -805,9 +805,9 @@ alternative_else_nop_endif
> > > > > > 
> > > > > >  2:
> > > > > >  	tramp_map_kernel	x30
> > > > > >  
> > > > > >  #ifdef CONFIG_RANDOMIZE_BASE
> > > > > > 
> > > > > > -	adr	x30, tramp_vectors + PAGE_SIZE
> > > > > > +	adrp	x30, tramp_vectors + PAGE_SIZE
> > > > > > 
> > > > > >  alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
> > > > > > 
> > > > > > -	ldr	x30, [x30]
> > > > > > +	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
> > > > > 
> > > > > I think this is busted for !4K kernels once we reduce the alignment of
> > > > > __entry_tramp_data_start.
> > > > > 
> > > > > The ADRP gives us a 64K aligned address (with bits 15:0 clear). The
> > > > > lo12
> > > > > relocation gives us bits 11:0, so we haven't accounted for bits 15:12.
> > > > 
> > > > IMU, ADRP gives a 4K aligned value, regardless of MMU (TCR) settings.
> > > 
> > > Sorry, I had erroneously assumed tramp_vectors was page aligned. The
> > > issue still stands -- we haven't accounted for bits 15:12, as those can
> > > differ between tramp_vectors and __entry_tramp_data_start.
> 
> Does that mean that the SDEI code never worked with page size > 4 KiB?

I think this happens to work, but is fragile. Because nothing happens to
get placed in .rodata between the _entry_tramp_data_start data and the
__sdei_asm_trampoline_next_handler data, the
__sdei_asm_trampoline_next_handler data doesn't spill into a separate
page from the _entry_tramp_data_start data.

If we did start adding stuff into .rodata between those two, there'd be
a bigger risk of things going wrong. That was why I suggested a
.entry.tramp.data section previously.

> > Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> > below doesn't solve the issue I'm seeing (only reverting patch 3).
> 
> AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset 
> in the main vmlinux does not match the architected offset inside the fixmap. If 
> so, then using the symbol directly will not work at all.

Indeed. I can't see a neat way of avoiding this right now, so should we
drop these patches and leave the code as-is (but with comments as to the
special requirements that it has)?

Thanks,
Mark.
Catalin Marinas March 24, 2020, 11:23 a.m. UTC | #8
On Tue, Mar 24, 2020 at 10:52:17AM +0000, Mark Rutland wrote:
> On Mon, Mar 23, 2020 at 10:42:30PM +0200, Rémi Denis-Courmont wrote:
> > Le maanantaina 23. maaliskuuta 2020, 21.04.09 EET Catalin Marinas a écrit :
> > > Should we just use adrp on __entry_tramp_data_start? Anyway, the diff
> > > below doesn't solve the issue I'm seeing (only reverting patch 3).
> > 
> > AFAIU, the preexisting code uses the manual PAGE_SIZE offset because the offset 
> > in the main vmlinux does not match the architected offset inside the fixmap. If 
> > so, then using the symbol directly will not work at all.
> 
> Indeed. I can't see a neat way of avoiding this right now, so should we
> drop these patches and leave the code as-is (but with comments as to the
> special requirements that it has)?

I'm going to drop these three patches from -next for now but I can take
any updated comments (they are pretty much missing from this code).

Thanks.
diff mbox series

Patch

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index e5d4e30ee242..24f828739696 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -805,9 +805,9 @@  alternative_else_nop_endif
 2:
 	tramp_map_kernel	x30
 #ifdef CONFIG_RANDOMIZE_BASE
-	adr	x30, tramp_vectors + PAGE_SIZE
+	adrp	x30, tramp_vectors + PAGE_SIZE
 alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
-	ldr	x30, [x30]
+	ldr	x30, [x30, #:lo12:__entry_tramp_data_start]
 #else
 	ldr	x30, =vectors
 #endif
@@ -953,9 +953,8 @@  SYM_CODE_START(__sdei_asm_entry_trampoline)
 1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
 
 #ifdef CONFIG_RANDOMIZE_BASE
-	adr	x4, tramp_vectors + PAGE_SIZE
-	add	x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
-	ldr	x4, [x4]
+	adrp	x4, tramp_vectors + PAGE_SIZE
+	ldr	x4, [x4, #:lo12:__sdei_asm_trampoline_next_handler]
 #else
 	ldr	x4, =__sdei_asm_handler
 #endif