Message ID | 20230703190044.311730-1-jhubbard@nvidia.com (mailing list archive) |
---|---|
State | New |
Series | mm: riscv: fix an unsafe pte read in huge_pte_alloc() |
On Mon, Jul 03, 2023 at 12:00:44PM -0700, John Hubbard wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
>
> In [1], similar code actually caused a kernel crash on 64-bit x86, when
> using clang to build the kernel, but only after the conversion from *pte
> reads, to ptep_get(pte). The latter uses READ_ONCE(), which forced a
> double read of *pte.
>
> Rather than waiting for the upcoming ptep_get() conversion, just convert
> this part of the code now, but in a way that avoids the above problem:
> take a single snapshot of the pte before using it in the WARN
> conditional.
>
> As expected, this preparatory step does not actually change the
> generated code ("make mm/hugetlbpage.s"), on riscv64, when using a gcc
> 12.2 cross compiler.
>
> [1] https://lore.kernel.org/20230630013203.1955064-1-jhubbard@nvidia.com
>
> Suggested-by: James Houghton <jthoughton@google.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  arch/riscv/mm/hugetlbpage.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
> index 542883b3b49b..96225a8533ad 100644
> --- a/arch/riscv/mm/hugetlbpage.c
> +++ b/arch/riscv/mm/hugetlbpage.c
> @@ -73,7 +73,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
>  	}
>
>  out:
> -	WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
> +	if (pte) {
> +		pte_t pteval = ptep_get_lockless(pte);

I think ptep_get_lockless() on riscv (even riscv32) will always just be
ptep_get(), since pte_t is unsigned long, which can be read atomically.

> +
> +		WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval));

Ensuring we only read the pte once is good though.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

Thanks,
drew

> +	}
>  	return pte;
>  }
>
>
> base-commit: 0a8d6c9c7128a93689fba384cdd7f72b0ce19abd
> --
> 2.41.0
>

On 03/07/2023 20:00, John Hubbard wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
>
> In [1], similar code actually caused a kernel crash on 64-bit x86, when
> using clang to build the kernel, but only after the conversion from *pte
> reads, to ptep_get(pte). The latter uses READ_ONCE(), which forced a
> double read of *pte.
>
> Rather than waiting for the upcoming ptep_get() conversion, just convert

I'm not sure there is any upcoming ptep_get() conversion for riscv? Not from
me at least - my focus was on the generic code, to sufficiently encapsulate it
as an enabler for some follow-on arm64 changes.

> this part of the code now, but in a way that avoids the above problem:
> take a single snapshot of the pte before using it in the WARN
> conditional.
>
> As expected, this preparatory step does not actually change the
> generated code ("make mm/hugetlbpage.s"), on riscv64, when using a gcc
> 12.2 cross compiler.
>
> [1] https://lore.kernel.org/20230630013203.1955064-1-jhubbard@nvidia.com
>
> Suggested-by: James Houghton <jthoughton@google.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

> ---
>  arch/riscv/mm/hugetlbpage.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
> index 542883b3b49b..96225a8533ad 100644
> --- a/arch/riscv/mm/hugetlbpage.c
> +++ b/arch/riscv/mm/hugetlbpage.c
> @@ -73,7 +73,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
>  	}
>
>  out:
> -	WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
> +	if (pte) {
> +		pte_t pteval = ptep_get_lockless(pte);
> +
> +		WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval));
> +	}
>  	return pte;
>  }
>
>
> base-commit: 0a8d6c9c7128a93689fba384cdd7f72b0ce19abd

On Mon, 03 Jul 2023 12:00:44 -0700, John Hubbard wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
>
> [...]

Applied, thanks!

[1/1] mm: riscv: fix an unsafe pte read in huge_pte_alloc()
      https://git.kernel.org/palmer/c/62ba41d27612

Best regards,

Hello:

This patch was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Mon, 3 Jul 2023 12:00:44 -0700 you wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
>
> [...]

Here is the summary with links:
  - mm: riscv: fix an unsafe pte read in huge_pte_alloc()
    https://git.kernel.org/riscv/c/62ba41d27612

You are awesome, thank you!

diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index 542883b3b49b..96225a8533ad 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -73,7 +73,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
 	}
 
 out:
-	WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
+	if (pte) {
+		pte_t pteval = ptep_get_lockless(pte);
+
+		WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval));
+	}
 	return pte;
 }

The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
to false positives, because the pte is read twice at the C language
level, locklessly, within the same conditional statement. Depending on
compiler behavior, this can lead to generated machine code that actually
reads the pte just once, or twice. Reading twice will expose the code to
changing pte values and cause incorrect behavior.

In [1], similar code actually caused a kernel crash on 64-bit x86, when
using clang to build the kernel, but only after the conversion from *pte
reads, to ptep_get(pte). The latter uses READ_ONCE(), which forced a
double read of *pte.

Rather than waiting for the upcoming ptep_get() conversion, just convert
this part of the code now, but in a way that avoids the above problem:
take a single snapshot of the pte before using it in the WARN
conditional.

As expected, this preparatory step does not actually change the
generated code ("make mm/hugetlbpage.s"), on riscv64, when using a gcc
12.2 cross compiler.

[1] https://lore.kernel.org/20230630013203.1955064-1-jhubbard@nvidia.com

Suggested-by: James Houghton <jthoughton@google.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 arch/riscv/mm/hugetlbpage.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)


base-commit: 0a8d6c9c7128a93689fba384cdd7f72b0ce19abd
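
For readers outside the kernel tree, the single-snapshot pattern described in
the commit message can be sketched in plain userspace C. This is only an
illustration under stated assumptions: demo_pte_t, demo_ptep_get(),
demo_pte_present(), demo_pte_huge(), demo_should_warn() and the two bit values
are hypothetical stand-ins, not the riscv definitions, and the volatile cast
merely imitates the "load exactly once" effect of READ_ONCE().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for pte_t; the real type lives in the kernel. */
typedef struct { uintptr_t val; } demo_pte_t;

/* Assumed bit layout, for illustration only. */
#define DEMO_PTE_PRESENT ((uintptr_t)0x1)
#define DEMO_PTE_HUGE    ((uintptr_t)0x2)

/*
 * Take one snapshot of the entry. The volatile access makes the compiler
 * emit a single load here, in the spirit of READ_ONCE().
 */
static inline demo_pte_t demo_ptep_get(const demo_pte_t *ptep)
{
	demo_pte_t snap;

	snap.val = *(const volatile uintptr_t *)&ptep->val;
	return snap;
}

static inline bool demo_pte_present(demo_pte_t pte)
{
	return (pte.val & DEMO_PTE_PRESENT) != 0;
}

static inline bool demo_pte_huge(demo_pte_t pte)
{
	return (pte.val & DEMO_PTE_HUGE) != 0;
}

/*
 * Returns true when the warning condition would fire. Both predicates
 * test the same local copy, so a concurrent update of *ptep between the
 * two checks cannot produce an inconsistent result.
 */
static bool demo_should_warn(const demo_pte_t *ptep)
{
	demo_pte_t pteval;

	if (!ptep)
		return false;

	pteval = demo_ptep_get(ptep);
	return demo_pte_present(pteval) && !demo_pte_huge(pteval);
}
```

The point of the shape is that the shared location is dereferenced in exactly
one place; everything after that operates on a value that no other thread can
change.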