
[V3,3/4] arm64/mm: Inhibit huge-vmap with ptdump

Message ID 1557824407-19092-4-git-send-email-anshuman.khandual@arm.com (mailing list archive)
State New, archived
Series arm64/mm: Enable memory hot remove

Commit Message

Anshuman Khandual May 14, 2019, 9 a.m. UTC
From: Mark Rutland <mark.rutland@arm.com>

The arm64 ptdump code can race with concurrent modification of the
kernel page tables. At the time this was added, this was sound as:

* Modifications to leaf entries could result in stale information being
  logged, but would not result in a functional problem.

* Boot time modifications to non-leaf entries (e.g. freeing of initmem)
  were performed when the ptdump code cannot be invoked.

* At runtime, modifications to non-leaf entries only occurred in the
  vmalloc region, and these were strictly additive, as intermediate
  entries were never freed.

However, since commit:

  commit 324420bf91f6 ("arm64: add support for ioremap() block mappings")

... it has been possible to create huge mappings in the vmalloc area at
runtime, and as part of this, existing intermediate levels of table may be
removed and freed.

It's possible for the ptdump code to race with this, and continue to
walk tables which have been freed (and potentially poisoned or
reallocated). As a result of this, the ptdump code may dereference bogus
addresses, which could be fatal.

Since huge-vmap is a TLB and memory optimization, we can disable it when
the runtime ptdump code is in use to avoid this problem.
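
To make the race concrete, here is a minimal user-space sketch (plain C with
made-up names, not kernel code) of a walker racing against a thread that
replaces an intermediate table with a block entry and frees it:

#include <stdint.h>
#include <stdlib.h>

#define ENTRIES 512

struct table { uintptr_t entry[ENTRIES]; };

/* Upper-level table; an entry either points to a lower table or is a block. */
static struct table upper;

static int is_table(uintptr_t e) { return (e & 1) != 0; }	/* bit 0: table */
static struct table *table_of(uintptr_t e)
{
	return (struct table *)(e & ~(uintptr_t)3);
}

/* Thread A: ptdump-style walker. */
void dump(void)
{
	for (int i = 0; i < ENTRIES; i++) {
		uintptr_t e = upper.entry[i];

		if (!is_table(e))
			continue;
		/*
		 * Window: if set_block() replaces upper.entry[i] with a block
		 * entry and frees the lower table right here, the loop below
		 * walks freed (possibly poisoned or reused) memory.
		 */
		struct table *lower = table_of(e);
		for (int j = 0; j < ENTRIES; j++)
			(void)lower->entry[j];	/* potential bogus dereference */
	}
}

/* Thread B: huge-vmap-style update. */
void set_block(int i, uintptr_t phys)
{
	struct table *old = is_table(upper.entry[i]) ?
			    table_of(upper.entry[i]) : NULL;

	upper.entry[i] = phys | 2;	/* install a block entry (bit 1) */
	/* on real hardware a TLB invalidation would go here */
	free(old);			/* the walker may still hold this pointer */
}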

Fixes: 324420bf91f60582 ("arm64: add support for ioremap() block mappings")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/mm/mmu.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

David Hildenbrand May 14, 2019, 9:20 a.m. UTC | #1
On 14.05.19 11:00, Anshuman Khandual wrote:
> From: Mark Rutland <mark.rutland@arm.com>
> 
> The arm64 ptdump code can race with concurrent modification of the
> kernel page tables. At the time this was added, this was sound as:
> 
> * Modifications to leaf entries could result in stale information being
>   logged, but would not result in a functional problem.
> 
> * Boot time modifications to non-leaf entries (e.g. freeing of initmem)
>   were performed when the ptdump code cannot be invoked.
> 
> * At runtime, modifications to non-leaf entries only occurred in the
>   vmalloc region, and these were strictly additive, as intermediate
>   entries were never freed.
> 
> However, since commit:
> 
>   commit 324420bf91f6 ("arm64: add support for ioremap() block mappings")
> 
> ... it has been possible to create huge mappings in the vmalloc area at
> runtime, and as part of this, existing intermediate levels of table may be
> removed and freed.
> 
> It's possible for the ptdump code to race with this, and continue to
> walk tables which have been freed (and potentially poisoned or
> reallocated). As a result of this, the ptdump code may dereference bogus
> addresses, which could be fatal.
> 
> Since huge-vmap is a TLB and memory optimization, we can disable it when
> the runtime ptdump code is in use to avoid this problem.

I wonder if there is another way to protect against such a race.
(IOW, a lock). But as you say, it's a debug feature, so this is an easy fix.

Looks good to me (with limited arm64 code insight :) )
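
For reference, a lock-based alternative along the lines hinted at above might
look roughly like the sketch below. This is purely illustrative: neither the
lock nor these hooks exist in this series, and ptdump_walk_pgd()/struct
ptdump_info are named after the existing arm64 dump code only for orientation.

#include <linux/mutex.h>
#include <linux/seq_file.h>
#include <asm/ptdump.h>

/* Serialise the debugfs dump against intermediate-table freeing. */
static DEFINE_MUTEX(ptdump_lock);

/* Would wrap the existing debugfs dump walk. */
void ptdump_walk_pgd_locked(struct seq_file *m, struct ptdump_info *info)
{
	mutex_lock(&ptdump_lock);
	ptdump_walk_pgd(m, info);	/* existing arm64 walker */
	mutex_unlock(&ptdump_lock);
}

/* Every path that frees intermediate tables would have to take the same lock. */
void vmap_table_free_begin(void)
{
	mutex_lock(&ptdump_lock);
}

void vmap_table_free_end(void)
{
	mutex_unlock(&ptdump_lock);
}

The cost is adding locking to the vmalloc/ioremap teardown path solely for the
benefit of a debug feature, which is presumably why disabling huge-vmap is the
simpler fix here.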

> 
> Fixes: 324420bf91f60582 ("arm64: add support for ioremap() block mappings")
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm64/mm/mmu.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index ef82312..37a902c 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -955,13 +955,18 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
>  
>  int __init arch_ioremap_pud_supported(void)
>  {
> -	/* only 4k granule supports level 1 block mappings */
> -	return IS_ENABLED(CONFIG_ARM64_4K_PAGES);
> +	/*
> +	 * Only 4k granule supports level 1 block mappings.
> +	 * SW table walks can't handle removal of intermediate entries.
> +	 */
> +	return IS_ENABLED(CONFIG_ARM64_4K_PAGES) &&
> +	       !IS_ENABLED(CONFIG_ARM64_PTDUMP_DEBUGFS);
>  }
>  
>  int __init arch_ioremap_pmd_supported(void)
>  {
> -	return 1;
> +	/* See arch_ioremap_pud_supported() */
> +	return !IS_ENABLED(CONFIG_ARM64_PTDUMP_DEBUGFS);
>  }
>  
>  int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
>
Ard Biesheuvel May 16, 2019, 8:38 a.m. UTC | #2
On Tue, 14 May 2019 at 11:00, Anshuman Khandual
<anshuman.khandual@arm.com> wrote:
>
> From: Mark Rutland <mark.rutland@arm.com>
>
> The arm64 ptdump code can race with concurrent modification of the
> kernel page tables. At the time this was added, this was sound as:
>
> * Modifications to leaf entries could result in stale information being
>   logged, but would not result in a functional problem.
>
> * Boot time modifications to non-leaf entries (e.g. freeing of initmem)
>   were performed when the ptdump code cannot be invoked.
>
> * At runtime, modifications to non-leaf entries only occurred in the
>   vmalloc region, and these were strictly additive, as intermediate
>   entries were never freed.
>
> However, since commit:
>
>   commit 324420bf91f6 ("arm64: add support for ioremap() block mappings")
>
> ... it has been possible to create huge mappings in the vmalloc area at
> runtime, and as part of this, existing intermediate levels of table may be
> removed and freed.
>
> It's possible for the ptdump code to race with this, and continue to
> walk tables which have been freed (and potentially poisoned or
> reallocated). As a result of this, the ptdump code may dereference bogus
> addresses, which could be fatal.
>
> Since huge-vmap is a TLB and memory optimization, we can disable it when
> the runtime ptdump code is in use to avoid this problem.
>

The reason I added this originally is so that we don't trigger a
warning when unmapping vmappings that use 2 MB block mappings, which
happens when we free the kernel's .init segment. The ability to create
such mappings is indeed an optimization that the kernel mapping code
does not rely on.


> Fixes: 324420bf91f60582 ("arm64: add support for ioremap() block mappings")
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>

Acked-by: Ard Biesheuvel <ard.biesheuvel@arm.com>

> ---
>  arch/arm64/mm/mmu.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index ef82312..37a902c 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -955,13 +955,18 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
>
>  int __init arch_ioremap_pud_supported(void)
>  {
> -       /* only 4k granule supports level 1 block mappings */
> -       return IS_ENABLED(CONFIG_ARM64_4K_PAGES);
> +       /*
> +        * Only 4k granule supports level 1 block mappings.
> +        * SW table walks can't handle removal of intermediate entries.
> +        */
> +       return IS_ENABLED(CONFIG_ARM64_4K_PAGES) &&
> +              !IS_ENABLED(CONFIG_ARM64_PTDUMP_DEBUGFS);
>  }
>
>  int __init arch_ioremap_pmd_supported(void)
>  {
> -       return 1;
> +       /* See arch_ioremap_pud_supported() */
> +       return !IS_ENABLED(CONFIG_ARM64_PTDUMP_DEBUGFS);
>  }
>
>  int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
> --
> 2.7.4
>

Patch

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index ef82312..37a902c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -955,13 +955,18 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
 
 int __init arch_ioremap_pud_supported(void)
 {
-	/* only 4k granule supports level 1 block mappings */
-	return IS_ENABLED(CONFIG_ARM64_4K_PAGES);
+	/*
+	 * Only 4k granule supports level 1 block mappings.
+	 * SW table walks can't handle removal of intermediate entries.
+	 */
+	return IS_ENABLED(CONFIG_ARM64_4K_PAGES) &&
+	       !IS_ENABLED(CONFIG_ARM64_PTDUMP_DEBUGFS);
 }
 
 int __init arch_ioremap_pmd_supported(void)
 {
-	return 1;
+	/* See arch_ioremap_pud_supported() */
+	return !IS_ENABLED(CONFIG_ARM64_PTDUMP_DEBUGFS);
 }
 
 int pud_set_huge(pud_t *pudp, phys_addr_t phys, pgprot_t prot)
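
For context, the generic ioremap code consults these hooks once at boot and
caches the result; the sketch below is a simplified rendering of that consumer
(roughly following lib/ioremap.c of this era; the real code also honours the
"nohugeiomap" command-line option):

/* Simplified sketch of the generic consumer of these hooks (see lib/ioremap.c). */
static int __read_mostly ioremap_pud_capable;
static int __read_mostly ioremap_pmd_capable;

void __init ioremap_huge_init(void)
{
	if (arch_ioremap_pud_supported())
		ioremap_pud_capable = 1;
	if (arch_ioremap_pmd_supported())
		ioremap_pmd_capable = 1;
}

/*
 * ioremap_page_range() only attempts pud_set_huge()/pmd_set_huge() when the
 * corresponding *_capable flag is set, so having the arm64 hooks return 0
 * when CONFIG_ARM64_PTDUMP_DEBUGFS is enabled keeps the vmalloc/ioremap path
 * on page mappings, and no intermediate tables are ever freed at runtime.
 */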