
arm: flush: don't abuse pfn_valid() to check if pfn is in RAM

Message ID 20240131125907.1006760-1-liuyongqiang13@huawei.com (mailing list archive)
State New, archived
Series arm: flush: don't abuse pfn_valid() to check if pfn is in RAM

Commit Message

Yongqiang Liu Jan. 31, 2024, 12:59 p.m. UTC
Commit a4d5613c4dc6 ("arm: extend pfn_valid to take into account
freed memory map alignment") changed the semantics of pfn_valid() to check
presence of the memory map for a PFN, so __sync_icache_dcache() should use
memblock_is_map_memory() instead of pfn_valid() to check whether a PFN is in
RAM. In some uio cases we get a crash on a system with the
following memory layout:

 node   0: [mem 0x00000000c0a00000-0x00000000cc8fffff]
 node   0: [mem 0x00000000d0000000-0x00000000da1fffff]
 the uio region is: base 0xc0900000, size 0x100000

The crash backtrace looks like:

  Unable to handle kernel paging request at virtual address bff00000
  [...]
  CPU: 1 PID: 465 Comm: startapp.bin Tainted: G           O      5.10.0 #1
  Hardware name: Generic DT based system
  PC is at b15_flush_kern_dcache_area+0x24/0x3c
  LR is at __sync_icache_dcache+0x6c/0x98
  [...]
   (b15_flush_kern_dcache_area) from (__sync_icache_dcache+0x6c/0x98)
   (__sync_icache_dcache) from (set_pte_at+0x28/0x54)
   (set_pte_at) from (remap_pfn_range+0x1a0/0x274)
   (remap_pfn_range) from (uio_mmap+0x184/0x1b8 [uio])
   (uio_mmap [uio]) from (__mmap_region+0x264/0x5f4)
   (__mmap_region) from (__do_mmap_mm+0x3ec/0x440)
   (__do_mmap_mm) from (do_mmap+0x50/0x58)
   (do_mmap) from (vm_mmap_pgoff+0xfc/0x188)
   (vm_mmap_pgoff) from (ksys_mmap_pgoff+0xac/0xc4)
   (ksys_mmap_pgoff) from (ret_fast_syscall+0x0/0x5c)
  Code: e0801001 e2423001 e1c00003 f57ff04f (ee070f3e)
  ---[ end trace 09cf0734c3805d52 ]---
  Kernel panic - not syncing: Fatal exception

Fixes: a4d5613c4dc6 ("arm: extend pfn_valid to take into account freed memory map alignment")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
---
 arch/arm/mm/flush.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
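
For context, the sketch below is ordinary userspace C, not kernel code; the
bank bounds come from the layout above, and the 4 MiB pageblock size is an
assumption (the usual value for 32-bit ARM with 4K pages). It illustrates why
the two checks disagree for the uio region at 0xc0900000: the memory map is
freed at boot only in pageblock-aligned chunks, so after a4d5613c4dc6
pfn_valid() also covers the alignment padding below the first bank, while
memblock_is_map_memory() only covers actual RAM.

  /* Illustrative only -- not kernel code. */
  #include <stdint.h>
  #include <stdio.h>

  #define PAGEBLOCK_SIZE  (4UL << 20)          /* assumed 4 MiB pageblock */

  int main(void)
  {
          uint64_t bank_start = 0xc0a00000, bank_end = 0xcc900000;
          uint64_t uio_base   = 0xc0900000;    /* not RAM, just below the bank */

          /* memblock_is_map_memory(): strict "is this address in a RAM bank?" */
          int in_ram = uio_base >= bank_start && uio_base < bank_end;

          /* pfn_valid() after a4d5613c4dc6: the memory map extends down to the
           * pageblock-aligned start of the bank, so a struct page exists. */
          uint64_t memmap_start = bank_start & ~(uint64_t)(PAGEBLOCK_SIZE - 1);
          int has_memmap = uio_base >= memmap_start && uio_base < bank_end;

          printf("in RAM: %d, memory map present: %d\n", in_ram, has_memmap); /* 0 1 */
          return 0;
  }

With pfn_valid() returning true, __sync_icache_dcache() goes on to
pfn_to_page() and a dcache flush of a lowmem alias that was never mapped,
which is the fault shown in the backtrace above.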

Comments

Florian Fainelli Jan. 31, 2024, 4:16 p.m. UTC | #1
On 1/31/2024 4:59 AM, Yongqiang Liu wrote:
> Commit a4d5613c4dc6 ("arm: extend pfn_valid to take into account
> freed memory map alignment") changed the semantics of pfn_valid() to check
> presence of the memory map for a PFN, so __sync_icache_dcache() should use
> memblock_is_map_memory() instead of pfn_valid() to check whether a PFN is in
> RAM. In some uio cases we get a crash on a system with the
> following memory layout:
> 
>   node   0: [mem 0x00000000c0a00000-0x00000000cc8fffff]
>   node   0: [mem 0x00000000d0000000-0x00000000da1fffff]
>   the uio region is: base 0xc0900000, size 0x100000
> 
> The crash backtrace looks like:
> 
>    Unable to handle kernel paging request at virtual address bff00000
>    [...]
>    CPU: 1 PID: 465 Comm: startapp.bin Tainted: G           O      5.10.0 #1
>    Hardware name: Generic DT based system
>    PC is at b15_flush_kern_dcache_area+0x24/0x3c

Hmm, out of curiosity, what Broadcom platform with a Brahma-B15 CPU are you
using?

>    LR is at __sync_icache_dcache+0x6c/0x98
>    [...]
>     (b15_flush_kern_dcache_area) from (__sync_icache_dcache+0x6c/0x98)
>     (__sync_icache_dcache) from (set_pte_at+0x28/0x54)
>     (set_pte_at) from (remap_pfn_range+0x1a0/0x274)
>     (remap_pfn_range) from (uio_mmap+0x184/0x1b8 [uio])
>     (uio_mmap [uio]) from (__mmap_region+0x264/0x5f4)
>     (__mmap_region) from (__do_mmap_mm+0x3ec/0x440)
>     (__do_mmap_mm) from (do_mmap+0x50/0x58)
>     (do_mmap) from (vm_mmap_pgoff+0xfc/0x188)
>     (vm_mmap_pgoff) from (ksys_mmap_pgoff+0xac/0xc4)
>     (ksys_mmap_pgoff) from (ret_fast_syscall+0x0/0x5c)
>    Code: e0801001 e2423001 e1c00003 f57ff04f (ee070f3e)
>    ---[ end trace 09cf0734c3805d52 ]---
>    Kernel panic - not syncing: Fatal exception
> 
> Fixes: a4d5613c4dc6 ("arm: extend pfn_valid to take into account freed memory map alignment")
> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
> ---
>   arch/arm/mm/flush.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> index d19d140a10c7..11ec6c5ff5fc 100644
> --- a/arch/arm/mm/flush.c
> +++ b/arch/arm/mm/flush.c
> @@ -15,6 +15,7 @@
>   #include <asm/smp_plat.h>
>   #include <asm/tlbflush.h>
>   #include <linux/hugetlb.h>
> +#include <linux/memblock.h>
>   
>   #include "mm.h"
>   
> @@ -292,7 +293,7 @@ void __sync_icache_dcache(pte_t pteval)
>   		/* only flush non-aliasing VIPT caches for exec mappings */
>   		return;
>   	pfn = pte_pfn(pteval);
> -	if (!pfn_valid(pfn))
> +	if (!memblock_is_map_memory(PFN_PHYS(pfn)))
>   		return;
>   
>   	folio = page_folio(pfn_to_page(pfn));
Robin Murphy Jan. 31, 2024, 6:39 p.m. UTC | #2
On 31/01/2024 12:59 pm, Yongqiang Liu wrote:
> Commit a4d5613c4dc6 ("arm: extend pfn_valid to take into account
> freed memory map alignment") changed the semantics of pfn_valid() to check
> presence of the memory map for a PFN, so __sync_icache_dcache() should use
> memblock_is_map_memory() instead of pfn_valid() to check whether a PFN is in
> RAM. In some uio cases we get a crash on a system with the
> following memory layout:
> 
>   node   0: [mem 0x00000000c0a00000-0x00000000cc8fffff]
>   node   0: [mem 0x00000000d0000000-0x00000000da1fffff]
>   the uio region is: base 0xc0900000, size 0x100000
> 
> The crash backtrace looks like:
> 
>    Unable to handle kernel paging request at virtual address bff00000
>    [...]
>    CPU: 1 PID: 465 Comm: startapp.bin Tainted: G           O      5.10.0 #1
>    Hardware name: Generic DT based system
>    PC is at b15_flush_kern_dcache_area+0x24/0x3c
>    LR is at __sync_icache_dcache+0x6c/0x98
>    [...]
>     (b15_flush_kern_dcache_area) from (__sync_icache_dcache+0x6c/0x98)
>     (__sync_icache_dcache) from (set_pte_at+0x28/0x54)
>     (set_pte_at) from (remap_pfn_range+0x1a0/0x274)
>     (remap_pfn_range) from (uio_mmap+0x184/0x1b8 [uio])
>     (uio_mmap [uio]) from (__mmap_region+0x264/0x5f4)
>     (__mmap_region) from (__do_mmap_mm+0x3ec/0x440)
>     (__do_mmap_mm) from (do_mmap+0x50/0x58)
>     (do_mmap) from (vm_mmap_pgoff+0xfc/0x188)
>     (vm_mmap_pgoff) from (ksys_mmap_pgoff+0xac/0xc4)
>     (ksys_mmap_pgoff) from (ret_fast_syscall+0x0/0x5c)
>    Code: e0801001 e2423001 e1c00003 f57ff04f (ee070f3e)
>    ---[ end trace 09cf0734c3805d52 ]---
>    Kernel panic - not syncing: Fatal exception
> 
> Fixes: a4d5613c4dc6 ("arm: extend pfn_valid to take into account freed memory map alignment")
> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
> ---
>   arch/arm/mm/flush.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
> index d19d140a10c7..11ec6c5ff5fc 100644
> --- a/arch/arm/mm/flush.c
> +++ b/arch/arm/mm/flush.c
> @@ -15,6 +15,7 @@
>   #include <asm/smp_plat.h>
>   #include <asm/tlbflush.h>
>   #include <linux/hugetlb.h>
> +#include <linux/memblock.h>
>   
>   #include "mm.h"
>   
> @@ -292,7 +293,7 @@ void __sync_icache_dcache(pte_t pteval)
>   		/* only flush non-aliasing VIPT caches for exec mappings */
>   		return;
>   	pfn = pte_pfn(pteval);
> -	if (!pfn_valid(pfn))
> +	if (!memblock_is_map_memory(PFN_PHYS(pfn)))
>   		return;
>   
>   	folio = page_folio(pfn_to_page(pfn));

Hmm, it's a bit odd in context, since pfn_valid() obviously pairs with 
this pfn_to_page(), whereas it's not necessarily clear that 
memblock_is_map_memory() implies pfn_valid().

However, in this case we're starting from a PTE - rather than going off 
to do a slow scan of memblock to determine whether a round-trip through 
page_address() is going to give back a mapped VA, can we not trivially 
identify that from whether the PTE itself is valid?

Thanks,
Robin.
Russell King (Oracle) Jan. 31, 2024, 7 p.m. UTC | #3
On Wed, Jan 31, 2024 at 06:39:31PM +0000, Robin Murphy wrote:
> On 31/01/2024 12:59 pm, Yongqiang Liu wrote:
> > @@ -292,7 +293,7 @@ void __sync_icache_dcache(pte_t pteval)
> >   		/* only flush non-aliasing VIPT caches for exec mappings */
> >   		return;
> >   	pfn = pte_pfn(pteval);
> > -	if (!pfn_valid(pfn))
> > +	if (!memblock_is_map_memory(PFN_PHYS(pfn)))
> >   		return;
> >   	folio = page_folio(pfn_to_page(pfn));
> 
> Hmm, it's a bit odd in context, since pfn_valid() obviously pairs with this
> pfn_to_page(), whereas it's not necessarily clear that
> memblock_is_map_memory() implies pfn_valid().
> 
> However, in this case we're starting from a PTE - rather than going off to
> do a slow scan of memblock to determine whether a round-trip through
> page_address() is going to give back a mapped VA, can we not trivially
> identify that from whether the PTE itself is valid?

Depends what you mean by "valid". If you're referring to pte_valid()
and L_PTE_VALID then no.

On 32-bit non-LPAE, the valid bit is the same as the present bit, and
needs to be set for the PTE to not fault. Any PTE that is mapping
something will be "valid" whether it is memory or not, whether it is
backed by a page or not.

pfn_valid() should be telling us whether the PFN is suitable to be
passed to pfn_to_page(), and if we have a situation where pfn_valid()
returns true, but pfn_to_page() returns an invalid page, then that in
itself is a bug that needs to be fixed and probably has far reaching
implications for the stability of the kernel.
Robin Murphy Jan. 31, 2024, 9:20 p.m. UTC | #4
On 2024-01-31 7:00 pm, Russell King (Oracle) wrote:
> On Wed, Jan 31, 2024 at 06:39:31PM +0000, Robin Murphy wrote:
>> On 31/01/2024 12:59 pm, Yongqiang Liu wrote:
>>> @@ -292,7 +293,7 @@ void __sync_icache_dcache(pte_t pteval)
>>>    		/* only flush non-aliasing VIPT caches for exec mappings */
>>>    		return;
>>>    	pfn = pte_pfn(pteval);
>>> -	if (!pfn_valid(pfn))
>>> +	if (!memblock_is_map_memory(PFN_PHYS(pfn)))
>>>    		return;
>>>    	folio = page_folio(pfn_to_page(pfn));
>>
>> Hmm, it's a bit odd in context, since pfn_valid() obviously pairs with this
>> pfn_to_page(), whereas it's not necessarily clear that
>> memblock_is_map_memory() implies pfn_valid().
>>
>> However, in this case we're starting from a PTE - rather than going off to
>> do a slow scan of memblock to determine whether a round-trip through
>> page_address() is going to give back a mapped VA, can we not trivially
>> identify that from whether the PTE itself is valid?
> 
> Depends what you mean by "valid". If you're referring to pte_valid()
> and L_PTE_VALID then no.
> 
> On 32-bit non-LPAE, the valid bit is the same as the present bit, and
> needs to be set for the PTE to not fault. Any PTE that is mapping
> something will be "valid" whether it is memory or not, whether it is
> backed by a page or not.
> 
> pfn_valid() should be telling us whether the PFN is suitable to be
> passed to pfn_to_page(), and if we have a situation where pfn_valid()
> returns true, but pfn_to_page() returns an invalid page, then that in
> itself is a bug that needs to be fixed and probably has far reaching
> implications for the stability of the kernel.

Right, the problem here seems to be the opposite one, wherein we *do* 
often have a valid struct page for an address which is reserved and thus 
not mapped by the kernel, but seemingly we then take it down a path 
which assumes anything !PageHighmem() is lowmem and dereferences 
page_address() without looking.

However I realise I should have looked closer at the caller, and my idea 
is futile since the PTE here is for a userspace mapping, not a kernel 
VA, and is already pte_valid_user() && !pte_special(). Plus the fact 
that the stack trace indicates an mmap() path suggests it most likely is 
a legitimate mapping of some no-map carveout or MMIO region. Oh well. My 
first point still stands, though - I think at least a comment to clarify 
that assumption would be warranted.

Thanks,
Robin.
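
To make the page_address() point concrete: assuming the default 3G/1G split
(PAGE_OFFSET = 0xc0000000) and PHYS_OFFSET equal to the start of the first
bank, 0xc0a00000 (both are assumptions, since the report does not state them),
the lowmem alias computed for the uio physical address is exactly the faulting
address from the backtrace:

  __va(0xc0900000) = 0xc0900000 - PHYS_OFFSET + PAGE_OFFSET
                   = 0xc0900000 - 0xc0a00000 + 0xc0000000
                   = 0xbff00000   ("Unable to handle kernel paging request")
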
Yongqiang Liu Feb. 1, 2024, 8 a.m. UTC | #5
Thank you very much for the extra explanation. I noticed that commit 024591f9a6e0
("arm: ioremap: don't abuse pfn_valid() to check if pfn is in RAM") uses
memblock_is_map_memory() instead of pfn_valid() to check whether a PFN is in
RAM, so I wrote this patch to handle the same case. Otherwise, when we use a
pageblock-aligned (4M) memory or uio address, like:

      node   0: [mem 0x00000000c0c00000-0x00000000cc8fffff]
      node   0: [mem 0x00000000d0000000-0x00000000da1fffff]

or a uio region set like:

    0xc0400000, 0x100000

pfn_valid() will return false, the same as memblock_is_map_memory().
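
A quick way to see why this second layout behaves differently (again assuming
a 4 MiB pageblock size): in the original report the bank did not start on a
pageblock boundary, so the memory map was kept down to the previous 4 MiB
boundary and covered the uio region, whereas here the bank is already aligned
and nothing below it keeps a memory map:

  0xc0a00000 rounded down to 4 MiB = 0xc0800000  (memmap covers the uio region at 0xc0900000)
  0xc0c00000 rounded down to 4 MiB = 0xc0c00000  (uio region at 0xc0400000 stays outside the memmap)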

On 2024/2/1 5:20, Robin Murphy wrote:
> On 2024-01-31 7:00 pm, Russell King (Oracle) wrote:
>> On Wed, Jan 31, 2024 at 06:39:31PM +0000, Robin Murphy wrote:
>>> On 31/01/2024 12:59 pm, Yongqiang Liu wrote:
>>>> @@ -292,7 +293,7 @@ void __sync_icache_dcache(pte_t pteval)
>>>>            /* only flush non-aliasing VIPT caches for exec mappings */
>>>>            return;
>>>>        pfn = pte_pfn(pteval);
>>>> -    if (!pfn_valid(pfn))
>>>> +    if (!memblock_is_map_memory(PFN_PHYS(pfn)))
>>>>            return;
>>>>        folio = page_folio(pfn_to_page(pfn));
>>>
>>> Hmm, it's a bit odd in context, since pfn_valid() obviously pairs 
>>> with this
>>> pfn_to_page(), whereas it's not necessarily clear that
>>> memblock_is_map_memory() implies pfn_valid().
>>>
>>> However, in this case we're starting from a PTE - rather than going 
>>> off to
>>> do a slow scan of memblock to determine whether a round-trip through
>>> page_address() is going to give back a mapped VA, can we not trivially
>>> identify that from whether the PTE itself is valid?
>>
>> Depends what you mean by "valid". If you're referring to pte_valid()
>> and L_PTE_VALID then no.
>>
>> On 32-bit non-LPAE, the valid bit is the same as the present bit, and
>> needs to be set for the PTE to not fault. Any PTE that is mapping
>> something will be "valid" whether it is memory or not, whether it is
>> backed by a page or not.
>>
>> pfn_valid() should be telling us whether the PFN is suitable to be
>> passed to pfn_to_page(), and if we have a situation where pfn_valid()
>> returns true, but pfn_to_page() returns an invalid page, then that in
>> itself is a bug that needs to be fixed and probably has far reaching
>> implications for the stability of the kernel.
>
> Right, the problem here seems to be the opposite one, wherein we *do* 
> often have a valid struct page for an address which is reserved and 
> thus not mapped by the kernel, but seemingly we then take it down a 
> path which assumes anything !PageHighmem() is lowmem and dereferences 
> page_address() without looking.
>
> However I realise I should have looked closer at the caller, and my 
> idea is futile since the PTE here is for a userspace mapping, not a 
> kernel VA, and is already pte_valid_user() && !pte_special(). Plus the 
> fact that the stack trace indicates an mmap() path suggests it most 
> likely is a legitimate mapping of some no-map carveout or MMIO region. 
> Oh well. My first point still stands, though - I think at least a 
> comment to clarify that assumption would be warranted.
>
> Thanks,
> Robin.
> .
Mike Rapoport Feb. 1, 2024, 9 a.m. UTC | #6
Hi,

Please don't top-post to Linux mailing lists.

On Thu, Feb 01, 2024 at 04:00:04PM +0800, Yongqiang Liu wrote:
> Thank you very much for the extra explanation. I noticed that commit 024591f9a6e0
> ("arm: ioremap: don't abuse pfn_valid() to check if pfn is in RAM") uses
> memblock_is_map_memory() instead of pfn_valid() to check whether a PFN is in
> RAM, so I wrote this patch to handle the same case. Otherwise, when we use a
> pageblock-aligned (4M) memory or uio address, like:
> 
>      node   0: [mem 0x00000000c0c00000-0x00000000cc8fffff]
>      node   0: [mem 0x00000000d0000000-0x00000000da1fffff]
> 
> or a uio region set like:
> 
>    0xc0400000, 0x100000
> 
> pfn_valid() will return false, the same as memblock_is_map_memory().

pfn_valid() should return false if and only if there is no struct page for
that pfn.

My understanding is that struct pages exist for the range of UIO addresses,
and hopefully they have PG_reserved bit set, so a better fix IMO would be
to check if the folio is !reserved.
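
A minimal sketch of that suggestion (untested, and assuming the relevant uio
pages really do keep PG_reserved set, as confirmed in the follow-up below)
would keep the pfn_valid() check and bail out on reserved pages instead of
consulting memblock:

	pfn = pte_pfn(pteval);
	if (!pfn_valid(pfn))
		return;

	/* Treat reserved pages (e.g. the no-map carveout handed to uio here)
	 * as not safe to flush through the lowmem alias. */
	if (PageReserved(pfn_to_page(pfn)))
		return;

	folio = page_folio(pfn_to_page(pfn));

Compared to the posted patch this avoids the memblock scan Robin mentioned, at
the cost of relying on PG_reserved being set for every such region.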
 
> On 2024/2/1 5:20, Robin Murphy wrote:
> > On 2024-01-31 7:00 pm, Russell King (Oracle) wrote:
> > > On Wed, Jan 31, 2024 at 06:39:31PM +0000, Robin Murphy wrote:
> > > > On 31/01/2024 12:59 pm, Yongqiang Liu wrote:
> > > > > @@ -292,7 +293,7 @@ void __sync_icache_dcache(pte_t pteval)
> > > > >            /* only flush non-aliasing VIPT caches for exec mappings */
> > > > >            return;
> > > > >        pfn = pte_pfn(pteval);
> > > > > -    if (!pfn_valid(pfn))
> > > > > +    if (!memblock_is_map_memory(PFN_PHYS(pfn)))
> > > > >            return;
> > > > >        folio = page_folio(pfn_to_page(pfn));
> > > > 
> > > > Hmm, it's a bit odd in context, since pfn_valid() obviously
> > > > pairs with this
> > > > pfn_to_page(), whereas it's not necessarily clear that
> > > > memblock_is_map_memory() implies pfn_valid().
> > > > 
> > > > However, in this case we're starting from a PTE - rather than
> > > > going off to
> > > > do a slow scan of memblock to determine whether a round-trip through
> > > > page_address() is going to give back a mapped VA, can we not trivially
> > > > identify that from whether the PTE itself is valid?
> > > 
> > > Depends what you mean by "valid". If you're referring to pte_valid()
> > > and L_PTE_VALID then no.
> > > 
> > > On 32-bit non-LPAE, the valid bit is the same as the present bit, and
> > > needs to be set for the PTE to not fault. Any PTE that is mapping
> > > something will be "valid" whether it is memory or not, whether it is
> > > backed by a page or not.
> > > 
> > > pfn_valid() should be telling us whether the PFN is suitable to be
> > > passed to pfn_to_page(), and if we have a situation where pfn_valid()
> > > returns true, but pfn_to_page() returns an invalid page, then that in
> > > itself is a bug that needs to be fixed and probably has far reaching
> > > implications for the stability of the kernel.
> > 
> > Right, the problem here seems to be the opposite one, wherein we *do*
> > often have a valid struct page for an address which is reserved and thus
> > not mapped by the kernel, but seemingly we then take it down a path
> > which assumes anything !PageHighmem() is lowmem and dereferences
> > page_address() without looking.
> > 
> > However I realise I should have looked closer at the caller, and my idea
> > is futile since the PTE here is for a userspace mapping, not a kernel
> > VA, and is already pte_valid_user() && !pte_special(). Plus the fact
> > that the stack trace indicates an mmap() path suggests it most likely is
> > a legitimate mapping of some no-map carveout or MMIO region. Oh well. My
> > first point still stands, though - I think at least a comment to clarify
> > that assumption would be warranted.
> > 
> > Thanks,
> > Robin.
> > .
Yongqiang Liu Feb. 2, 2024, 3:19 a.m. UTC | #7
On 2024/2/1 17:00, Mike Rapoport wrote:
> Hi,
>
> Please don't top-post to Linux mailing lists.
>
> On Thu, Feb 01, 2024 at 04:00:04PM +0800, Yongqiang Liu wrote:
>> Thank you very much for the extra explanation. I noticed that commit 024591f9a6e0
>> ("arm: ioremap: don't abuse pfn_valid() to check if pfn is in RAM") uses
>> memblock_is_map_memory() instead of pfn_valid() to check whether a PFN is in
>> RAM, so I wrote this patch to handle the same case. Otherwise, when we use a
>> pageblock-aligned (4M) memory or uio address, like:
>>
>>       node   0: [mem 0x00000000c0c00000-0x00000000cc8fffff]
>>       node   0: [mem 0x00000000d0000000-0x00000000da1fffff]
>>
>> or a uio region set like:
>>
>>     0xc0400000, 0x100000
>>
>> pfn_valid() will return false, the same as memblock_is_map_memory().
> pfn_valid() should return false if and only if there is no struct page for
> that pfn.
>
> My understanding is that struct pages exist for the range of UIO addresses,
> and hopefully they have PG_reserved bit set, so a better fix IMO would be
> to check if the folio is !reserved.
>   

Thanks! All of the uio pages have the PG_reserved bit set. I'm also wondering
whether other cases have the same issue, such as DMA or anything that uses
virt_addr_valid().
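
For reference on the virt_addr_valid() question: on 32-bit ARM that helper is
essentially a lowmem range check combined with pfn_valid(), so it sees the
same semantics change. The snippet below is only an illustrative
approximation, not the literal arch/arm definition:

	/* Rough shape of the check, for illustration only. */
	static bool rough_virt_addr_valid(unsigned long kaddr)
	{
		return kaddr >= PAGE_OFFSET &&
		       kaddr < (unsigned long)high_memory &&
		       pfn_valid(__pa(kaddr) >> PAGE_SHIFT);
	}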

>> On 2024/2/1 5:20, Robin Murphy wrote:
>>> On 2024-01-31 7:00 pm, Russell King (Oracle) wrote:
>>>> On Wed, Jan 31, 2024 at 06:39:31PM +0000, Robin Murphy wrote:
>>>>> On 31/01/2024 12:59 pm, Yongqiang Liu wrote:
>>>>>> @@ -292,7 +293,7 @@ void __sync_icache_dcache(pte_t pteval)
>>>>>>             /* only flush non-aliasing VIPT caches for exec mappings */
>>>>>>             return;
>>>>>>         pfn = pte_pfn(pteval);
>>>>>> -    if (!pfn_valid(pfn))
>>>>>> +    if (!memblock_is_map_memory(PFN_PHYS(pfn)))
>>>>>>             return;
>>>>>>         folio = page_folio(pfn_to_page(pfn));
>>>>> Hmm, it's a bit odd in context, since pfn_valid() obviously
>>>>> pairs with this
>>>>> pfn_to_page(), whereas it's not necessarily clear that
>>>>> memblock_is_map_memory() implies pfn_valid().
>>>>>
>>>>> However, in this case we're starting from a PTE - rather than
>>>>> going off to
>>>>> do a slow scan of memblock to determine whether a round-trip through
>>>>> page_address() is going to give back a mapped VA, can we not trivially
>>>>> identify that from whether the PTE itself is valid?
>>>> Depends what you mean by "valid". If you're referring to pte_valid()
>>>> and L_PTE_VALID then no.
>>>>
>>>> On 32-bit non-LPAE, the valid bit is the same as the present bit, and
>>>> needs to be set for the PTE to not fault. Any PTE that is mapping
>>>> something will be "valid" whether it is memory or not, whether it is
>>>> backed by a page or not.
>>>>
>>>> pfn_valid() should be telling us whether the PFN is suitable to be
>>>> passed to pfn_to_page(), and if we have a situation where pfn_valid()
>>>> returns true, but pfn_to_page() returns an invalid page, then that in
>>>> itself is a bug that needs to be fixed and probably has far reaching
>>>> implications for the stability of the kernel.
>>> Right, the problem here seems to be the opposite one, wherein we *do*
>>> often have a valid struct page for an address which is reserved and thus
>>> not mapped by the kernel, but seemingly we then take it down a path
>>> which assumes anything !PageHighmem() is lowmem and dereferences
>>> page_address() without looking.
>>>
>>> However I realise I should have looked closer at the caller, and my idea
>>> is futile since the PTE here is for a userspace mapping, not a kernel
>>> VA, and is already pte_valid_user() && !pte_special(). Plus the fact
>>> that the stack trace indicates an mmap() path suggests it most likely is
>>> a legitimate mapping of some no-map carveout or MMIO region. Oh well. My
>>> first point still stands, though - I think at least a comment to clarify
>>> that assumption would be warranted.
>>>
>>> Thanks,
>>> Robin.
>>> .

Patch

diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index d19d140a10c7..11ec6c5ff5fc 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -15,6 +15,7 @@ 
 #include <asm/smp_plat.h>
 #include <asm/tlbflush.h>
 #include <linux/hugetlb.h>
+#include <linux/memblock.h>
 
 #include "mm.h"
 
@@ -292,7 +293,7 @@  void __sync_icache_dcache(pte_t pteval)
 		/* only flush non-aliasing VIPT caches for exec mappings */
 		return;
 	pfn = pte_pfn(pteval);
-	if (!pfn_valid(pfn))
+	if (!memblock_is_map_memory(PFN_PHYS(pfn)))
 		return;
 
 	folio = page_folio(pfn_to_page(pfn));