diff mbox series

mm: riscv: fix an unsafe pte read in huge_pte_alloc()

Message ID 20230703190044.311730-1-jhubbard@nvidia.com (mailing list archive)
State New
Headers show
Series mm: riscv: fix an unsafe pte read in huge_pte_alloc() | expand

Commit Message

John Hubbard July 3, 2023, 7 p.m. UTC
The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
to false positives, because the pte is read twice at the C language
level, locklessly, within the same conditional statement. Depending on
compiler behavior, this can lead to generated machine code that actually
reads the pte just once, or twice. Reading twice will expose the code to
changing pte values and cause incorrect behavior.

In [1], similar code actually caused a kernel crash on 64-bit x86, when
using clang to build the kernel, but only after the conversion from *pte
reads, to ptep_get(pte). The latter uses READ_ONCE(), which forced a
double read of *pte.

Rather than waiting for the upcoming ptep_get() conversion, just convert
this part of the code now, but in a way that avoids the above problem:
take a single snapshot of the pte before using it in the WARN
conditional.

As expected, this preparatory step does not actually change the
generated code ("make mm/hugetlbpage.s"), on riscv64, when using a gcc
12.2 cross compiler.

[1] https://lore.kernel.org/20230630013203.1955064-1-jhubbard@nvidia.com

Suggested-by: James Houghton <jthoughton@google.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 arch/riscv/mm/hugetlbpage.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)


base-commit: 0a8d6c9c7128a93689fba384cdd7f72b0ce19abd

Comments

Andrew Jones July 4, 2023, 6:01 a.m. UTC | #1
On Mon, Jul 03, 2023 at 12:00:44PM -0700, John Hubbard wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
> 
> In [1], similar code actually caused a kernel crash on 64-bit x86, when
> using clang to build the kernel, but only after the conversion from *pte
> reads, to ptep_get(pte). The latter uses READ_ONCE(), which forced a
> double read of *pte.
> 
> Rather than waiting for the upcoming ptep_get() conversion, just convert
> this part of the code now, but in a way that avoids the above problem:
> take a single snapshot of the pte before using it in the WARN
> conditional.
> 
> As expected, this preparatory step does not actually change the
> generated code ("make mm/hugetlbpage.s"), on riscv64, when using a gcc
> 12.2 cross compiler.
> 
> [1] https://lore.kernel.org/20230630013203.1955064-1-jhubbard@nvidia.com
> 
> Suggested-by: James Houghton <jthoughton@google.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  arch/riscv/mm/hugetlbpage.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
> index 542883b3b49b..96225a8533ad 100644
> --- a/arch/riscv/mm/hugetlbpage.c
> +++ b/arch/riscv/mm/hugetlbpage.c
> @@ -73,7 +73,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
>  	}
>  
>  out:
> -	WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
> +	if (pte) {
> +		pte_t pteval = ptep_get_lockless(pte);

I think ptep_get_lockless() on riscv (even riscv32) will always just be
ptep_get(), since pte_t is unsigned long, which can be read atomically.

> +
> +		WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval));

Ensuring we only read the pte once is good though.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

Thanks,
drew


> +	}
>  	return pte;
>  }
>  
> 
> base-commit: 0a8d6c9c7128a93689fba384cdd7f72b0ce19abd
> -- 
> 2.41.0
>
Ryan Roberts July 4, 2023, 7:06 a.m. UTC | #2
On 03/07/2023 20:00, John Hubbard wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
> 
> In [1], similar code actually caused a kernel crash on 64-bit x86, when
> using clang to build the kernel, but only after the conversion from *pte
> reads, to ptep_get(pte). The latter uses READ_ONCE(), which forced a
> double read of *pte.
> 
> Rather than waiting for the upcoming ptep_get() conversion, just convert

I'm not sure there is any upcoming ptep_get() conversion for riscv? Not from me
at least - my focus was on the generic code to suficiently encapsulate it as an
enabler for some follow on arm64 changes.

> this part of the code now, but in a way that avoids the above problem:
> take a single snapshot of the pte before using it in the WARN
> conditional.
> 
> As expected, this preparatory step does not actually change the
> generated code ("make mm/hugetlbpage.s"), on riscv64, when using a gcc
> 12.2 cross compiler.
> 
> [1] https://lore.kernel.org/20230630013203.1955064-1-jhubbard@nvidia.com
> 
> Suggested-by: James Houghton <jthoughton@google.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

> ---
>  arch/riscv/mm/hugetlbpage.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
> index 542883b3b49b..96225a8533ad 100644
> --- a/arch/riscv/mm/hugetlbpage.c
> +++ b/arch/riscv/mm/hugetlbpage.c
> @@ -73,7 +73,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
>  	}
>  
>  out:
> -	WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
> +	if (pte) {
> +		pte_t pteval = ptep_get_lockless(pte);
> +
> +		WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval));
> +	}
>  	return pte;
>  }
>  
> 
> base-commit: 0a8d6c9c7128a93689fba384cdd7f72b0ce19abd
Palmer Dabbelt July 5, 2023, 11:38 p.m. UTC | #3
On Mon, 03 Jul 2023 12:00:44 -0700, John Hubbard wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
> 
> [...]

Applied, thanks!

[1/1] mm: riscv: fix an unsafe pte read in huge_pte_alloc()
      https://git.kernel.org/palmer/c/62ba41d27612

Best regards,
patchwork-bot+linux-riscv@kernel.org July 5, 2023, 11:50 p.m. UTC | #4
Hello:

This patch was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Mon, 3 Jul 2023 12:00:44 -0700 you wrote:
> The WARN_ON_ONCE() statement in riscv's huge_pte_alloc() is susceptible
> to false positives, because the pte is read twice at the C language
> level, locklessly, within the same conditional statement. Depending on
> compiler behavior, this can lead to generated machine code that actually
> reads the pte just once, or twice. Reading twice will expose the code to
> changing pte values and cause incorrect behavior.
> 
> [...]

Here is the summary with links:
  - mm: riscv: fix an unsafe pte read in huge_pte_alloc()
    https://git.kernel.org/riscv/c/62ba41d27612

You are awesome, thank you!
diff mbox series

Patch

diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index 542883b3b49b..96225a8533ad 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -73,7 +73,11 @@  pte_t *huge_pte_alloc(struct mm_struct *mm,
 	}
 
 out:
-	WARN_ON_ONCE(pte && pte_present(*pte) && !pte_huge(*pte));
+	if (pte) {
+		pte_t pteval = ptep_get_lockless(pte);
+
+		WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval));
+	}
 	return pte;
 }