Message ID | 20240928155219.266866-1-cs.os.kernel@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | riscv: mm: check the SV39 rule | expand |
Hi Cheng, On 28/09/2024 17:52, Cheng Chao wrote: > SV39 rule: the address of bits[63..39] should be the same as bit[38], > it is easy to violate if configure PAGE_OFFSET too small. > for instance, PAGE_OFFSET=0xffffffc0_0000_0000, Out of curiosity, why do you try to modify the current memory layout? > the FIXADDR_START will be the 0xfffffffb0_xxxx_xxxx, > bit[39] == 0'b1, bit[38] == 0'b0, when access the FIXADDR, > which cause the page fault. > It's difficult to debug, check it when compile is necessary. > > Signed-off-by: Cheng Chao <cs.os.kernel@gmail.com> > --- > arch/riscv/mm/init.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index bfa2dea95354..a5fb3fc2b2db 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -119,7 +119,7 @@ static inline void print_mlt(char *name, unsigned long b, unsigned long t) > > static inline void print_ml(char *name, unsigned long b, unsigned long t) > { > - unsigned long diff = t - b; > + unsigned long diff = t - b + 1; > > if (IS_ENABLED(CONFIG_64BIT) && (diff >> LOG2_SZ_1T) >= 10) > print_mlt(name, b, t); > @@ -1387,6 +1387,12 @@ static void __init arch_reserve_crashkernel(void) > > void __init paging_init(void) > { > +#ifdef CONFIG_64BIT > + BUILD_BUG_ON_MSG((VA_BITS == VA_BITS_SV39) && VA_BITS is determined at runtime, either by setting "noXlvl" on the kernel command line or by probing the HW capabilities, so the above can't be determined at build time, or am I missing something? > + (((FIXADDR_START & BIT(39)) >> 39) > + != ((FIXADDR_START & BIT(38)) >> 38)), > + "violate SV39 rule: bits[63..39] should be same as bit[38]"); > +#endif > setup_bootmem(); > setup_vm_final(); > Thanks, Alex
Hi Alexandre, Thanks for reviewing this patch. I don't reply on time due to the National Day :) On Mon, Sep 30, 2024 at 3:46 PM Alexandre Ghiti <alex@ghiti.fr> wrote: > > Hi Cheng, > > On 28/09/2024 17:52, Cheng Chao wrote: > > SV39 rule: the address of bits[63..39] should be the same as bit[38], > > it is easy to violate if configure PAGE_OFFSET too small. > > for instance, PAGE_OFFSET=0xffffffc0_0000_0000, > > > Out of curiosity, why do you try to modify the current memory layout? It's a long story. I'm working on kernel-5.10, and the default PAGE_OFFSET 0xffffffe0_0000_0000 can only map the 128G physical address, and we need to map more than 128G in the real world. After I changed PAGE_OFFSET to a smaller value which can map a more physical address. the kernel panic(page fault) when earlycon printk, the cause is 0xf, badaddr is 0xffffffbx_xxxx_xxxx. It took me several days to debug this panic, at the beginning, I considered MMU doesn't work for earlycon, so I traced the pgd/pmd/pte, all are correct before page fault. Occasionally, I found the SV39 rule: the address of bits[63..39] should be the same as bit[38], the address 0xffffffbx_xxxx_xxxx violates the rule. The root cause is that the kernel doesn't sense the SV39 rule, so when I adjust the PAGE_OFFSET, the FIXADDR_START will adjust too without any check. when accessing the [FIXADDR, FIXADDR + PMD_SIZE] , the kernel will have a page fault. > > > > the FIXADDR_START will be the 0xfffffffb0_xxxx_xxxx, > > bit[39] == 0'b1, bit[38] == 0'b0, when access the FIXADDR, > > which cause the page fault. > > It's difficult to debug, check it when compile is necessary. > > > > Signed-off-by: Cheng Chao <cs.os.kernel@gmail.com> > > --- > > arch/riscv/mm/init.c | 8 +++++++- > > 1 file changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > > index bfa2dea95354..a5fb3fc2b2db 100644 > > --- a/arch/riscv/mm/init.c > > +++ b/arch/riscv/mm/init.c > > @@ -1387,6 +1387,12 @@ static void __init arch_reserve_crashkernel(void) > > > > void __init paging_init(void) > > { > > +#ifdef CONFIG_64BIT > > + BUILD_BUG_ON_MSG((VA_BITS == VA_BITS_SV39) && > > > VA_BITS is determined at runtime, either by setting "noXlvl" on the > kernel command line or by probing the HW capabilities, so the above > can't be determined at build time, or am I missing something? > you are right, in case of CONFIG_XIP_KERNEL=y, VA_BITS will be VA_BITS_SV39, this patch works. other cases, disable_pgtable_l4 will enable SATP_MODE_39, we also check SV39 rule here? @@ -763,6 +763,11 @@ static void __init disable_pgtable_l4(void) pgtable_l4_enabled = false; kernel_map.page_offset = PAGE_OFFSET_L3; satp_mode = SATP_MODE_39; + + if ((VA_BITS == VA_BITS_SV39) && + (((FIXADDR_START & BIT(39)) >> 39) + != ((FIXADDR_START & BIT(38)) >> 38))) + panic("violate SV39 rule: bits[63..39] should be same as bit[38]\n"); } > > > + (((FIXADDR_START & BIT(39)) >> 39) > > + != ((FIXADDR_START & BIT(38)) >> 38)), > > + "violate SV39 rule: bits[63..39] should be same as bit[38]"); > > +#endif > > setup_bootmem(); > > setup_vm_final(); > > > > > Thanks, > > Alex > Thanks, Cheng Chao
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index bfa2dea95354..a5fb3fc2b2db 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -119,7 +119,7 @@ static inline void print_mlt(char *name, unsigned long b, unsigned long t) static inline void print_ml(char *name, unsigned long b, unsigned long t) { - unsigned long diff = t - b; + unsigned long diff = t - b + 1; if (IS_ENABLED(CONFIG_64BIT) && (diff >> LOG2_SZ_1T) >= 10) print_mlt(name, b, t); @@ -1387,6 +1387,12 @@ static void __init arch_reserve_crashkernel(void) void __init paging_init(void) { +#ifdef CONFIG_64BIT + BUILD_BUG_ON_MSG((VA_BITS == VA_BITS_SV39) && + (((FIXADDR_START & BIT(39)) >> 39) + != ((FIXADDR_START & BIT(38)) >> 38)), + "violate SV39 rule: bits[63..39] should be same as bit[38]"); +#endif setup_bootmem(); setup_vm_final();
SV39 rule: the address of bits[63..39] should be the same as bit[38], it is easy to violate if configure PAGE_OFFSET too small. for instance, PAGE_OFFSET=0xffffffc0_0000_0000, the FIXADDR_START will be the 0xfffffffb0_xxxx_xxxx, bit[39] == 0'b1, bit[38] == 0'b0, when access the FIXADDR, which cause the page fault. It's difficult to debug, check it when compile is necessary. Signed-off-by: Cheng Chao <cs.os.kernel@gmail.com> --- arch/riscv/mm/init.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)