diff mbox series

mm: migrate: add missing flush_dcache_page for non-mapped page migrate

Message ID 20190219123212.29838-1-larper@axis.com (mailing list archive)
State New, archived
Headers show
Series mm: migrate: add missing flush_dcache_page for non-mapped page migrate | expand

Commit Message

Lars Persson Feb. 19, 2019, 12:32 p.m. UTC
Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
and SIGSEGV that could not be traced back to a userspace code
bug. They had all the magic signs of an I/D cache coherency issue.

Now recently we noticed that the /proc/sys/vm/compact_memory interface
was quite efficient at provoking this class of userspace crashes.

Studying the code in mm/migrate.c there is a distinction made between
migrating a page that is mapped at the instant of migration and one
that is not mapped. Our problem turned out to be the non-mapped pages.

For the non-mapped page the code performs a copy of the page content
and all relevant meta-data of the page without doing the required
D-cache maintenance. This leaves dirty data in the D-cache of the CPU
and on the 1004K cores this data is not visible to the I-cache. A
subsequent page-fault that triggers a mapping of the page will happily
serve the process with potentially stale code.

What about ARM then, this bug should have seen greater exposure? Well
ARM became immune to this flaw back in 2010, see commit c01778001a4f
("ARM: 6379/1: Assume new page cache pages have dirty D-cache").

My proposed fix moves the D-cache maintenance inside move_to_new_page
to make it common for both cases.

Signed-off-by: Lars Persson <larper@axis.com>
---
 mm/migrate.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Mel Gorman Feb. 21, 2019, 5:10 p.m. UTC | #1
On Tue, Feb 19, 2019 at 01:32:12PM +0100, Lars Persson wrote:
> Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
> and SIGSEGV that could not be traced back to a userspace code
> bug. They had all the magic signs of an I/D cache coherency issue.
> 
> Now recently we noticed that the /proc/sys/vm/compact_memory interface
> was quite efficient at provoking this class of userspace crashes.
> 
> Studying the code in mm/migrate.c there is a distinction made between
> migrating a page that is mapped at the instant of migration and one
> that is not mapped. Our problem turned out to be the non-mapped pages.
> 
> For the non-mapped page the code performs a copy of the page content
> and all relevant meta-data of the page without doing the required
> D-cache maintenance. This leaves dirty data in the D-cache of the CPU
> and on the 1004K cores this data is not visible to the I-cache. A
> subsequent page-fault that triggers a mapping of the page will happily
> serve the process with potentially stale code.
> 
> What about ARM then, this bug should have seen greater exposure? Well
> ARM became immune to this flaw back in 2010, see commit c01778001a4f
> ("ARM: 6379/1: Assume new page cache pages have dirty D-cache").
> 
> My proposed fix moves the D-cache maintenance inside move_to_new_page
> to make it common for both cases.
> 
> Signed-off-by: Lars Persson <larper@axis.com>

Acked-by: Mel Gorman <mgorman@techsingularity.net>
Paul Burton Feb. 21, 2019, 8:36 p.m. UTC | #2
Hi Lars,

On Tue, Feb 19, 2019 at 01:32:12PM +0100, Lars Persson wrote:
> Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
> and SIGSEGV that could not be traced back to a userspace code
> bug. They had all the magic signs of an I/D cache coherency issue.
> 
> Now recently we noticed that the /proc/sys/vm/compact_memory interface
> was quite efficient at provoking this class of userspace crashes.
> 
> Studying the code in mm/migrate.c there is a distinction made between
> migrating a page that is mapped at the instant of migration and one
> that is not mapped. Our problem turned out to be the non-mapped pages.
> 
> For the non-mapped page the code performs a copy of the page content
> and all relevant meta-data of the page without doing the required
> D-cache maintenance. This leaves dirty data in the D-cache of the CPU
> and on the 1004K cores this data is not visible to the I-cache. A
> subsequent page-fault that triggers a mapping of the page will happily
> serve the process with potentially stale code.
> 
> What about ARM then, this bug should have seen greater exposure? Well
> ARM became immune to this flaw back in 2010, see commit c01778001a4f
> ("ARM: 6379/1: Assume new page cache pages have dirty D-cache").
> 
> My proposed fix moves the D-cache maintenance inside move_to_new_page
> to make it common for both cases.
> 
> Signed-off-by: Lars Persson <larper@axis.com>

Reviewed-by: Paul Burton <paul.burton@mips.com>

Thanks,
    Paul

> ---
>  mm/migrate.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
Vlastimil Babka Feb. 25, 2019, 3:07 p.m. UTC | #3
On 2/19/19 1:32 PM, Lars Persson wrote:
> Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
> and SIGSEGV that could not be traced back to a userspace code
> bug. They had all the magic signs of an I/D cache coherency issue.
> 
> Now recently we noticed that the /proc/sys/vm/compact_memory interface
> was quite efficient at provoking this class of userspace crashes.
> 
> Studying the code in mm/migrate.c there is a distinction made between
> migrating a page that is mapped at the instant of migration and one
> that is not mapped. Our problem turned out to be the non-mapped pages.
> 
> For the non-mapped page the code performs a copy of the page content
> and all relevant meta-data of the page without doing the required
> D-cache maintenance. This leaves dirty data in the D-cache of the CPU
> and on the 1004K cores this data is not visible to the I-cache. A
> subsequent page-fault that triggers a mapping of the page will happily
> serve the process with potentially stale code.
> 
> What about ARM then, this bug should have seen greater exposure? Well
> ARM became immune to this flaw back in 2010, see commit c01778001a4f
> ("ARM: 6379/1: Assume new page cache pages have dirty D-cache").
> 
> My proposed fix moves the D-cache maintenance inside move_to_new_page
> to make it common for both cases.
> 
> Signed-off-by: Lars Persson <larper@axis.com>

What about CC stable and a Fixes tag, would it be applicable here?

> ---
>  mm/migrate.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index d4fd680be3b0..80fc19e610b5 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -248,10 +248,8 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
>  				pte = swp_entry_to_pte(entry);
>  			} else if (is_device_public_page(new)) {
>  				pte = pte_mkdevmap(pte);
> -				flush_dcache_page(new);
>  			}
> -		} else
> -			flush_dcache_page(new);
> +		}
>  
>  #ifdef CONFIG_HUGETLB_PAGE
>  		if (PageHuge(new)) {
> @@ -995,6 +993,13 @@ static int move_to_new_page(struct page *newpage, struct page *page,
>  		 */
>  		if (!PageMappingFlags(page))
>  			page->mapping = NULL;
> +
> +		if (unlikely(is_zone_device_page(newpage))) {
> +			if (is_device_public_page(newpage))
> +				flush_dcache_page(newpage);
> +		} else
> +			flush_dcache_page(newpage);
> +
>  	}
>  out:
>  	return rc;
>
Lars Persson Feb. 26, 2019, 8:40 a.m. UTC | #4
On 2/25/19 4:07 PM, Vlastimil Babka wrote:
> On 2/19/19 1:32 PM, Lars Persson wrote:
>> Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
>> and SIGSEGV that could not be traced back to a userspace code
>> bug. They had all the magic signs of an I/D cache coherency issue.
>>
>> Now recently we noticed that the /proc/sys/vm/compact_memory interface
>> was quite efficient at provoking this class of userspace crashes.
>>
>> Studying the code in mm/migrate.c there is a distinction made between
>> migrating a page that is mapped at the instant of migration and one
>> that is not mapped. Our problem turned out to be the non-mapped pages.
>>
>> For the non-mapped page the code performs a copy of the page content
>> and all relevant meta-data of the page without doing the required
>> D-cache maintenance. This leaves dirty data in the D-cache of the CPU
>> and on the 1004K cores this data is not visible to the I-cache. A
>> subsequent page-fault that triggers a mapping of the page will happily
>> serve the process with potentially stale code.
>>
>> What about ARM then, this bug should have seen greater exposure? Well
>> ARM became immune to this flaw back in 2010, see commit c01778001a4f
>> ("ARM: 6379/1: Assume new page cache pages have dirty D-cache").
>>
>> My proposed fix moves the D-cache maintenance inside move_to_new_page
>> to make it common for both cases.
>>
>> Signed-off-by: Lars Persson <larper@axis.com>
> 
> What about CC stable and a Fixes tag, would it be applicable here?
> 

Yes this is candidate for stable so let's add:
Cc: <stable@vger.kernel.org>

I do not find a good candidate for a Fixes tag.
Anshuman Khandual Feb. 26, 2019, 9:23 a.m. UTC | #5
On 02/19/2019 06:02 PM, Lars Persson wrote:
> Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
> and SIGSEGV that could not be traced back to a userspace code
> bug. They had all the magic signs of an I/D cache coherency issue.
> 
> Now recently we noticed that the /proc/sys/vm/compact_memory interface
> was quite efficient at provoking this class of userspace crashes.
> 
> Studying the code in mm/migrate.c there is a distinction made between
> migrating a page that is mapped at the instant of migration and one
> that is not mapped. Our problem turned out to be the non-mapped pages.
> 
> For the non-mapped page the code performs a copy of the page content
> and all relevant meta-data of the page without doing the required
> D-cache maintenance. This leaves dirty data in the D-cache of the CPU
> and on the 1004K cores this data is not visible to the I-cache. A
> subsequent page-fault that triggers a mapping of the page will happily
> serve the process with potentially stale code.

Just curious. Is not the code path which tries to map this page should
do the invalidation just before setting it up in the page table via
set_pte_at() or other similar variants ? How it maps without doing the
necessary flush.
Lars Persson Feb. 26, 2019, 9:46 a.m. UTC | #6
On Tue, Feb 26, 2019 at 10:23 AM Anshuman Khandual
<anshuman.khandual@arm.com> wrote:
> On 02/19/2019 06:02 PM, Lars Persson wrote:
> > Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
> > and SIGSEGV that could not be traced back to a userspace code
> > bug. They had all the magic signs of an I/D cache coherency issue.
> >
> > Now recently we noticed that the /proc/sys/vm/compact_memory interface
> > was quite efficient at provoking this class of userspace crashes.
> >
> > Studying the code in mm/migrate.c there is a distinction made between
> > migrating a page that is mapped at the instant of migration and one
> > that is not mapped. Our problem turned out to be the non-mapped pages.
> >
> > For the non-mapped page the code performs a copy of the page content
> > and all relevant meta-data of the page without doing the required
> > D-cache maintenance. This leaves dirty data in the D-cache of the CPU
> > and on the 1004K cores this data is not visible to the I-cache. A
> > subsequent page-fault that triggers a mapping of the page will happily
> > serve the process with potentially stale code.
>
> Just curious. Is not the code path which tries to map this page should
> do the invalidation just before setting it up in the page table via
> set_pte_at() or other similar variants ? How it maps without doing the
> necessary flush.

In fact this is what happens when the flush_dcache_page API was used
correctly, but it is an arch implementation detail. All kernel code
that writes to a page cage page must also call flush_dcache_page
before the page becomes eligible for mapping. The arch code has the
option to postpone the actual flush until set_pte_at maps the page.
Vlastimil Babka Feb. 26, 2019, 10:07 a.m. UTC | #7
On 2/26/19 9:40 AM, Lars Persson wrote:
>> What about CC stable and a Fixes tag, would it be applicable here?
>>
> 
> Yes this is candidate for stable so let's add:
> Cc: <stable@vger.kernel.org>
> 
> I do not find a good candidate for a Fixes tag.

How bout a version range where the bug needs to be fixed then?
Lars Persson Feb. 26, 2019, 11:57 a.m. UTC | #8
On 2/26/19 11:07 AM, Vlastimil Babka wrote:
> On 2/26/19 9:40 AM, Lars Persson wrote:
>>> What about CC stable and a Fixes tag, would it be applicable here?
>>>
>>
>> Yes this is candidate for stable so let's add:
>> Cc: <stable@vger.kernel.org>
>>
>> I do not find a good candidate for a Fixes tag.
> 
> How bout a version range where the bug needs to be fixed then?
> 

The distinction between mapped and non-mapped old page was introduced in 2ebba6b7e1d9 ("mm: unmapped page migration avoid unmap+remap overhead") so at least it applies to stable 4.4+.

Before that patch there was always a call to remove_migration_ptes() but I cannot conclude if those earlier versions actually will reach the flush_dcache_page call if the old page was unmapped.
Lars Persson March 7, 2019, 2:17 p.m. UTC | #9
On 2/26/19 12:57 PM, Lars Persson wrote:
> 
> 
> On 2/26/19 11:07 AM, Vlastimil Babka wrote:
>> On 2/26/19 9:40 AM, Lars Persson wrote:
>>>> What about CC stable and a Fixes tag, would it be applicable here?
>>>>
>>>
>>> Yes this is candidate for stable so let's add:
>>> Cc: <stable@vger.kernel.org>
>>>
>>> I do not find a good candidate for a Fixes tag.
>>
>> How bout a version range where the bug needs to be fixed then?
>>
> 
> The distinction between mapped and non-mapped old page was introduced in 2ebba6b7e1d9 ("mm: unmapped page migration avoid unmap+remap overhead") so at least it applies to stable 4.4+.
> 
> Before that patch there was always a call to remove_migration_ptes() but I cannot conclude if those earlier versions actually will reach the flush_dcache_page call if the old page was unmapped.
> 

Should I submit a V2 patch with CC stable for v4.4+ ?

- Lars
diff mbox series

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index d4fd680be3b0..80fc19e610b5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -248,10 +248,8 @@  static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 				pte = swp_entry_to_pte(entry);
 			} else if (is_device_public_page(new)) {
 				pte = pte_mkdevmap(pte);
-				flush_dcache_page(new);
 			}
-		} else
-			flush_dcache_page(new);
+		}
 
 #ifdef CONFIG_HUGETLB_PAGE
 		if (PageHuge(new)) {
@@ -995,6 +993,13 @@  static int move_to_new_page(struct page *newpage, struct page *page,
 		 */
 		if (!PageMappingFlags(page))
 			page->mapping = NULL;
+
+		if (unlikely(is_zone_device_page(newpage))) {
+			if (is_device_public_page(newpage))
+				flush_dcache_page(newpage);
+		} else
+			flush_dcache_page(newpage);
+
 	}
 out:
 	return rc;