Message ID | 20211002122026.1451269-1-mick@ics.forth.gr (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | riscv: Don't use va_pa_offset on kdump | expand |
On Sat, Oct 2, 2021 at 2:23 PM Nick Kossifidis <mick@ics.forth.gr> wrote: > > On kdump instead of using an intermediate step to relocate the kernel, that > lives in a "control buffer" outside the current kernel's mapping, we jump > to the crash kernel directly by calling riscv_kexec_norelocate(). The > current implementation uses va_pa_offset while switching to physical > addressing, however since we moved the kernel outside the linear mapping > this won't work anymore since riscv_kexec_norelocate() is part of the > kernel mapping and we should use kernel_map.va_kernel_pa_offset, and also > take XIP kernel into account. > > We don't really need to use va_pa_offset on riscv_kexec_norelocate, we can > just set STVEC to the physical address of the new kernel instead and let > the hart jump to the new kernel on the next instruction after setting > SATP to zero. This fixes kdump and is also simpler/cleaner. > > Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> > --- > arch/riscv/kernel/kexec_relocate.S | 15 +++++---------- > 1 file changed, 5 insertions(+), 10 deletions(-) > > diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S > index a80b52a74..e2f34196e 100644 > --- a/arch/riscv/kernel/kexec_relocate.S > +++ b/arch/riscv/kernel/kexec_relocate.S > @@ -159,25 +159,15 @@ SYM_CODE_START(riscv_kexec_norelocate) > * s0: (const) Phys address to jump to > * s1: (const) Phys address of the FDT image > * s2: (const) The hartid of the current hart > - * s3: (const) kernel_map.va_pa_offset, used when switching MMU off > */ > mv s0, a1 > mv s1, a2 > mv s2, a3 > - mv s3, a4 > > /* Disable / cleanup interrupts */ > csrw CSR_SIE, zero > csrw CSR_SIP, zero > > - /* Switch to physical addressing */ > - la s4, 1f > - sub s4, s4, s3 > - csrw CSR_STVEC, s4 > - csrw CSR_SATP, zero > - > -.align 2 > -1: > /* Pass the arguments to the next kernel / Cleanup*/ > mv a0, s2 > mv a1, s1 > @@ -214,6 +204,11 @@ SYM_CODE_START(riscv_kexec_norelocate) > csrw CSR_SCAUSE, zero > csrw CSR_SSCRATCH, zero > > + /* Switch to physical addressing */ > + csrw CSR_STVEC, a2 > + csrw CSR_SATP, zero > + > + /* This will trigger a jump to CSR_STVEC anyway */ > jalr zero, a2, 0 The last jump to a2 can be removed since the fault will be triggered before even reaching this instruction. > SYM_CODE_END(riscv_kexec_norelocate) > > -- > 2.32.0 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv This patch fixes a regression introduced when moving the kernel to the end of the address space, so we should add: Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear mapping") And it should be backported to 5.13 and 5.14. It seems that the following tags should be enough: Cc: <stable@vger.kernel.org> # 5.13 Cc: <stable@vger.kernel.org> # 5.14 And finally, you can add: Reviewed-by: Alexandre Ghiti <alex@ghiti.fr> Thanks, Alex
Στις 2021-10-06 14:13, Alexandre Ghiti έγραψε: >> + >> + /* This will trigger a jump to CSR_STVEC anyway */ >> jalr zero, a2, 0 > > The last jump to a2 can be removed since the fault will be triggered > before even reaching this instruction. > Just switching SATP to zero doesn't generate a trap unless mstatus.TVM is set (for visualization purposes). The hart will try and execute the next instruction but it's not clear in the spec what happens in case the code is cached, I don't want to rely solely on STVEC. I prefer having this instruction there, note that some earlier QEMU versions also had this behavior (the original kdump patch didn't set STVEC and it worked fine after setting SATP to zero). > > This patch fixes a regression introduced when moving the kernel to the > end of the address space, so we should add: > Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear > mapping") > > And it should be backported to 5.13 and 5.14. It seems that the > following tags should be enough: > > Cc: <stable@vger.kernel.org> # 5.13 > Cc: <stable@vger.kernel.org> # 5.14 > > And finally, you can add: > > Reviewed-by: Alexandre Ghiti <alex@ghiti.fr> > ACK, thanks ! I'll resend the patch with the tags you mentioned. Regards, Nick
On Sat, 09 Oct 2021 06:18:48 PDT (-0700), mick@ics.forth.gr wrote: > Στις 2021-10-06 14:13, Alexandre Ghiti έγραψε: >>> + >>> + /* This will trigger a jump to CSR_STVEC anyway */ >>> jalr zero, a2, 0 >> >> The last jump to a2 can be removed since the fault will be triggered >> before even reaching this instruction. >> > > Just switching SATP to zero doesn't generate a trap unless mstatus.TVM > is set (for visualization purposes). The hart will try and execute the > next instruction but it's not clear in the spec what happens in case the > code is cached, I don't want to rely solely on STVEC. I prefer having > this instruction there, note that some earlier QEMU versions also had > this behavior (the original kdump patch didn't set STVEC and it worked > fine after setting SATP to zero). IIRC this came down to some very specific wording in the spec. Something along the lines of the 0 in SATP meaning "no translation", SFENCE.VMA ordering translations, and the general "if the spec doesn't mention it then it has to work" logic. I thought I opened a spec issue about this for clarification, but I can't find it. That said, I'm perfectly fine taking the safe approach here as it's not like the performance matters here. Warrants a comment, though. > >> >> This patch fixes a regression introduced when moving the kernel to the >> end of the address space, so we should add: >> Fixes: 2bfc6cd81bd1 ("riscv: Move kernel mapping outside of linear >> mapping") >> >> And it should be backported to 5.13 and 5.14. It seems that the >> following tags should be enough: >> >> Cc: <stable@vger.kernel.org> # 5.13 >> Cc: <stable@vger.kernel.org> # 5.14 >> >> And finally, you can add: >> >> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr> >> > > ACK, thanks ! I'll resend the patch with the tags you mentioned. I don't have a v2 in my inbox, did I miss something? Also, if it's just the tags then it's generally not necessary to re-send something. The comment does, though. LMK if you want me to deal with this, or if there's going to be a v2. Thanks!
Στις 2021-10-23 23:14, Palmer Dabbelt έγραψε: > On Sat, 09 Oct 2021 06:18:48 PDT (-0700), mick@ics.forth.gr wrote: >> Στις 2021-10-06 14:13, Alexandre Ghiti έγραψε: >>>> + >>>> + /* This will trigger a jump to CSR_STVEC anyway */ >>>> jalr zero, a2, 0 >>> >>> The last jump to a2 can be removed since the fault will be triggered >>> before even reaching this instruction. >>> >> >> Just switching SATP to zero doesn't generate a trap unless mstatus.TVM >> is set (for visualization purposes). The hart will try and execute the >> next instruction but it's not clear in the spec what happens in case >> the >> code is cached, I don't want to rely solely on STVEC. I prefer having >> this instruction there, note that some earlier QEMU versions also had >> this behavior (the original kdump patch didn't set STVEC and it worked >> fine after setting SATP to zero). > > IIRC this came down to some very specific wording in the spec. > Something along the lines of the 0 in SATP meaning "no translation", > SFENCE.VMA ordering translations, and the general "if the spec doesn't > mention it then it has to work" logic. I thought I opened a spec > issue about this for clarification, but I can't find it. > I guess you mean this one: https://github.com/riscv/riscv-isa-manual/issues/538 I couldn't find anything though regarding cached code, it's not that there's going to be a load after setting satp to 0 if the code has been cached, so even if the translation is cached we don't have a guarantee that the next instruction will result a trap. > That said, I'm perfectly fine taking the safe approach here as it's > not like the performance matters here. Warrants a comment, though. > ACK > > I don't have a v2 in my inbox, did I miss something? Also, if it's > just the tags then it's generally not necessary to re-send something. > The comment does, though. > > LMK if you want me to deal with this, or if there's going to be a v2. > > Thanks! I'll send a v2 with the tags and the comment. Regards, Nick
diff --git a/arch/riscv/kernel/kexec_relocate.S b/arch/riscv/kernel/kexec_relocate.S index a80b52a74..e2f34196e 100644 --- a/arch/riscv/kernel/kexec_relocate.S +++ b/arch/riscv/kernel/kexec_relocate.S @@ -159,25 +159,15 @@ SYM_CODE_START(riscv_kexec_norelocate) * s0: (const) Phys address to jump to * s1: (const) Phys address of the FDT image * s2: (const) The hartid of the current hart - * s3: (const) kernel_map.va_pa_offset, used when switching MMU off */ mv s0, a1 mv s1, a2 mv s2, a3 - mv s3, a4 /* Disable / cleanup interrupts */ csrw CSR_SIE, zero csrw CSR_SIP, zero - /* Switch to physical addressing */ - la s4, 1f - sub s4, s4, s3 - csrw CSR_STVEC, s4 - csrw CSR_SATP, zero - -.align 2 -1: /* Pass the arguments to the next kernel / Cleanup*/ mv a0, s2 mv a1, s1 @@ -214,6 +204,11 @@ SYM_CODE_START(riscv_kexec_norelocate) csrw CSR_SCAUSE, zero csrw CSR_SSCRATCH, zero + /* Switch to physical addressing */ + csrw CSR_STVEC, a2 + csrw CSR_SATP, zero + + /* This will trigger a jump to CSR_STVEC anyway */ jalr zero, a2, 0 SYM_CODE_END(riscv_kexec_norelocate)
On kdump instead of using an intermediate step to relocate the kernel, that lives in a "control buffer" outside the current kernel's mapping, we jump to the crash kernel directly by calling riscv_kexec_norelocate(). The current implementation uses va_pa_offset while switching to physical addressing, however since we moved the kernel outside the linear mapping this won't work anymore since riscv_kexec_norelocate() is part of the kernel mapping and we should use kernel_map.va_kernel_pa_offset, and also take XIP kernel into account. We don't really need to use va_pa_offset on riscv_kexec_norelocate, we can just set STVEC to the physical address of the new kernel instead and let the hart jump to the new kernel on the next instruction after setting SATP to zero. This fixes kdump and is also simpler/cleaner. Signed-off-by: Nick Kossifidis <mick@ics.forth.gr> --- arch/riscv/kernel/kexec_relocate.S | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-)