
RISC-V: mm: Support huge page in vmalloc_fault()

Message ID 20230224104001.2743135-1-dylan@andestech.com (mailing list archive)
State Superseded
Series RISC-V: mm: Support huge page in vmalloc_fault()

Checks

Context Check Description
conchuod/cover_letter success Single patches do not need cover letters
conchuod/tree_selection success Guessed tree name to be fixes
conchuod/fixes_present success Fixes tag present in non-next series
conchuod/maintainers_pattern success MAINTAINERS pattern errors before the patch: 13 and now 13
conchuod/verify_signedoff success Signed-off-by tag matches author and committer
conchuod/kdoc success Errors and warnings before: 0 this patch: 0
conchuod/build_rv64_clang_allmodconfig success Errors and warnings before: 3 this patch: 3
conchuod/module_param success Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig success Errors and warnings before: 19 this patch: 19
conchuod/alphanumeric_selects success Out of order selects before the patch: 729 and now 729
conchuod/build_rv32_defconfig success Build OK
conchuod/dtb_warn_rv64 success Errors and warnings before: 2 this patch: 2
conchuod/header_inline success No static functions without inline keyword in header files
conchuod/checkpatch success total: 0 errors, 0 warnings, 0 checks, 23 lines checked
conchuod/source_inline success Was 0 now: 0
conchuod/build_rv64_nommu_k210_defconfig success Build OK
conchuod/verify_fixes fail Problems with Fixes tag: 1
conchuod/build_rv64_nommu_virt_defconfig success Build OK

Commit Message

Dylan Jhong Feb. 24, 2023, 10:40 a.m. UTC
RISC-V supports ioremap() with huge page (pud/pmd) mapping, but
vmalloc_fault() assumes that the vmalloc range is limited to pte
mappings. Add huge page support to complete the vmalloc_fault()
function.

Fixes: 310f541a027b ("riscv: Enable HAVE_ARCH_HUGE_VMAP for 64BIT")

Signed-off-by: Dylan Jhong <dylan@andestech.com>
---
 arch/riscv/mm/fault.c | 5 +++++
 1 file changed, 5 insertions(+)
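
For context, here is a condensed sketch of the page-table walk in vmalloc_fault() once both new leaf checks are applied. It is reconstructed from the hunks in this thread, not copied from the tree; the pgd/p4d synchronization above pud_k is unchanged by the patch and elided here, and the surrounding lines may differ slightly from the actual file.

	/* ... pgd/p4d levels synchronized with init_mm above ... */

	pud_k = pud_offset(p4d_k, addr);
	if (!pud_present(*pud_k)) {
		no_context(regs, addr);
		return;
	}
	/* New: a leaf PUD (1GB mapping) terminates the walk. */
	if (pud_leaf(*pud_k))
		goto flush_tlb;

	pmd_k = pmd_offset(pud_k, addr);
	if (!pmd_present(*pmd_k)) {
		no_context(regs, addr);
		return;
	}
	/* New: a leaf PMD (2MB mapping) terminates the walk. */
	if (pmd_leaf(*pmd_k))
		goto flush_tlb;

	/* Only regular 4KB mappings reach the PTE level. */
	pte_k = pte_offset_kernel(pmd_k, addr);
	if (!pte_present(*pte_k)) {
		no_context(regs, addr);
		return;
	}

flush_tlb:
	local_flush_tlb_page(addr);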

Comments

Alexandre Ghiti Feb. 24, 2023, 12:47 p.m. UTC | #1
Hi Dylan,

On 2/24/23 11:40, Dylan Jhong wrote:
> RISC-V supports ioremap() with huge page (pud/pmd) mapping, but
> vmalloc_fault() assumes that the vmalloc range is limited to pte
> mappings. Add huge page support to complete the vmalloc_fault()
> function.
>
> Fixes: 310f541a027b ("riscv: Enable HAVE_ARCH_HUGE_VMAP for 64BIT")
>
> Signed-off-by: Dylan Jhong <dylan@andestech.com>
> ---
>   arch/riscv/mm/fault.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index eb0774d9c03b..4b9953b47d81 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -143,6 +143,8 @@ static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long a
>   		no_context(regs, addr);
>   		return;
>   	}
> +	if (pud_leaf(*pud_k))
> +		goto flush_tlb;
>   
>   	/*
>   	 * Since the vmalloc area is global, it is unnecessary
> @@ -153,6 +155,8 @@ static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long a
>   		no_context(regs, addr);
>   		return;
>   	}
> +	if (pmd_leaf(*pmd_k))
> +		goto flush_tlb;
>   
>   	/*
>   	 * Make sure the actual PTE exists as well to
> @@ -172,6 +176,7 @@ static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long a
>   	 * ordering constraint, not a cache flush; it is
>   	 * necessary even after writing invalid entries.
>   	 */
> +flush_tlb:
>   	local_flush_tlb_page(addr);
>   }
>   


This looks good to me, you can add:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

One question: how did you encounter this bug?

Thanks,

Alex
Dylan Jhong March 1, 2023, 11:17 a.m. UTC | #2
On Fri, Feb 24, 2023 at 01:47:20PM +0100, Alexandre Ghiti wrote:
> Hi Dylan,
> 
> On 2/24/23 11:40, Dylan Jhong wrote:
> > RISC-V supports ioremap() with huge page (pud/pmd) mapping, but
> > vmalloc_fault() assumes that the vmalloc range is limited to pte
> > mappings. Add huge page support to complete the vmalloc_fault()
> > function.
> > 
> > Fixes: 310f541a027b ("riscv: Enable HAVE_ARCH_HUGE_VMAP for 64BIT")
> > 
> > Signed-off-by: Dylan Jhong <dylan@andestech.com>
> > [...]
> 
> 
> This looks good to me, you can add:
> 
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> 
> One question: how did you encounter this bug?
> 
> Thanks,
> 
> Alex
>
Hi Alex,

> One question: how did you encounter this bug?
This bug is caused by the combination of out-of-order execution and ioremap().
Out-of-order execution may speculatively access the VA returned by ioremap()
before the mapping has actually been created, so the TLB caches a stale
(invalid) translation for that VA. When the CPU later really accesses the VA
after ioremap() has set up the mapping, it still takes a page fault because
the TLB holds the stale entry.

We expect vmalloc_fault() in the page fault handler to issue sfence.vma and
invalidate the stale TLB entry[1]. But since vmalloc_fault() did not support
huge pages, it hit nested page faults while trying to walk pmd/pud huge
mappings down to a pte entry (see the sketch after the reference below).
This is the reason I sent this patch.

ref:
    [1]: https://patchwork.kernel.org/project/linux-riscv/patch/20210412000531.12249-1-liu@jiuyang.me/
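
To make the nested fault concrete: for a leaf PMD, the PFN field addresses the 2MB mapped region itself (here, ioremap()'d device memory), not a page of PTEs. A hedged illustration of the pre-patch hazard, using the same names as the patch:

	pmd_k = pmd_offset(pud_k, addr);
	if (!pmd_present(*pmd_k)) {	/* a leaf PMD *is* present ... */
		no_context(regs, addr);
		return;
	}
	/*
	 * ... so without the pmd_leaf() check the walk continues.
	 * pte_offset_kernel() then treats the huge mapping as a PTE
	 * table; for an MMIO range the computed table address has no
	 * valid kernel mapping, so the read of *pte_k below faults
	 * again from inside the fault handler.
	 */
	pte_k = pte_offset_kernel(pmd_k, addr);
	if (!pte_present(*pte_k)) {
		no_context(regs, addr);
		return;
	}

With the pud_leaf()/pmd_leaf() checks in place, the walk stops at the leaf level and only the stale TLB entry is flushed, which is all the fault required.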

Patch

diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index eb0774d9c03b..4b9953b47d81 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -143,6 +143,8 @@ static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long a
 		no_context(regs, addr);
 		return;
 	}
+	if (pud_leaf(*pud_k))
+		goto flush_tlb;
 
 	/*
 	 * Since the vmalloc area is global, it is unnecessary
@@ -153,6 +155,8 @@ static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long a
 		no_context(regs, addr);
 		return;
 	}
+	if (pmd_leaf(*pmd_k))
+		goto flush_tlb;
 
 	/*
 	 * Make sure the actual PTE exists as well to
@@ -172,6 +176,7 @@ static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long a
 	 * ordering constraint, not a cache flush; it is
 	 * necessary even after writing invalid entries.
 	 */
+flush_tlb:
 	local_flush_tlb_page(addr);
 }
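
A closing note on why local_flush_tlb_page() is sufficient here: on RISC-V it boils down to an address-scoped SFENCE.VMA, roughly as below (newer kernels wrap this in errata alternatives, so the exact definition may differ):

	static inline void local_flush_tlb_page(unsigned long addr)
	{
		/* Order the page-table write against later translations
		 * and drop any cached (possibly invalid) entry for addr. */
		__asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
	}

As the comment in the patch notes, SFENCE.VMA specifies an ordering constraint rather than a cache flush, so it is required even though the entry the TLB originally cached was invalid.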