diff mbox series

mm: Fix __dump_page() for poisoned pages

Message ID dbbcd36ca1f045ec81f49c7657928a1cdf24872b.1550065120.git.robin.murphy@arm.com (mailing list archive)
State New, archived
Headers show
Series mm: Fix __dump_page() for poisoned pages | expand

Commit Message

Robin Murphy Feb. 13, 2019, 1:40 p.m. UTC
Evaluating page_mapping() on a poisoned page ends up dereferencing junk
and making PF_POISONED_CHECK() considerably crashier than intended. Fix
that by not inspecting the mapping until we've determined that it's
likely to be valid.

Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 mm/debug.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Michal Hocko Feb. 13, 2019, 2:23 p.m. UTC | #1
On Wed 13-02-19 13:40:49, Robin Murphy wrote:
> Evaluating page_mapping() on a poisoned page ends up dereferencing junk
> and making PF_POISONED_CHECK() considerably crashier than intended. Fix
> that by not inspecting the mapping until we've determined that it's
> likely to be valid.

Has this ever triggered? I am mainly asking because there is no usage of
mapping so I would expect that the compiler wouldn't really call
page_mapping until it is really used.

> Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page")
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  mm/debug.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/debug.c b/mm/debug.c
> index 0abb987dad9b..1611cf00a137 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -44,7 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
>  
>  void __dump_page(struct page *page, const char *reason)
>  {
> -	struct address_space *mapping = page_mapping(page);
> +	struct address_space *mapping;
>  	bool page_poisoned = PagePoisoned(page);
>  	int mapcount;
>  
> @@ -58,6 +58,8 @@ void __dump_page(struct page *page, const char *reason)
>  		goto hex_only;
>  	}
>  
> +	mapping = page_mapping(page);
> +
>  	/*
>  	 * Avoid VM_BUG_ON() in page_mapcount().
>  	 * page->_mapcount space in struct page is used by sl[aou]b pages to
> -- 
> 2.20.1.dirty
>
Robin Murphy Feb. 13, 2019, 2:38 p.m. UTC | #2
On 13/02/2019 14:23, Michal Hocko wrote:
> On Wed 13-02-19 13:40:49, Robin Murphy wrote:
>> Evaluating page_mapping() on a poisoned page ends up dereferencing junk
>> and making PF_POISONED_CHECK() considerably crashier than intended. Fix
>> that by not inspecting the mapping until we've determined that it's
>> likely to be valid.
> 
> Has this ever triggered? I am mainly asking because there is no usage of
> mapping so I would expect that the compiler wouldn't really call
> page_mapping until it is really used.

A function call is a sequence point, so any compiler that did that would 
be totally broken.

The crash looks like this (now from an explicit dump_page() call before 
it happens naturally deep within pfn_to_nid()):
-----
[  107.147056] Unable to handle kernel NULL pointer dereference at 
virtual address 0000000000000006
[  107.155774] Mem abort info:
[  107.158546]   ESR = 0x96000005
[  107.161572]   Exception class = DABT (current EL), IL = 32 bits
[  107.167437]   SET = 0, FnV = 0
[  107.170460]   EA = 0, S1PTW = 0
[  107.173568] Data abort info:
[  107.176419]   ISV = 0, ISS = 0x00000005
[  107.180218]   CM = 0, WnR = 0
[  107.183151] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c2f6ac38
[  107.189702] [0000000000000006] pgd=0000000000000000, pud=0000000000000000
[  107.196430] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[  107.201942] Modules linked in:
[  107.204962] CPU: 2 PID: 491 Comm: bash Not tainted 5.0.0-rc1+ #1
[  107.210903] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Dec 17 2018
[  107.221576] pstate: 00000005 (nzcv daif -PAN -UAO)
[  107.226321] pc : page_mapping+0x18/0x118
[  107.230200] lr : __dump_page+0x1c/0x398
[  107.233990] sp : ffffff8011a53c30
[  107.237265] x29: ffffff8011a53c30 x28: ffffffc039b6ec00
[  107.242520] x27: 0000000000000000 x26: 0000000000000000
[  107.247775] x25: 0000000056000000 x24: 0000000000000015
[  107.253029] x23: ffffff80114d8b18 x22: 0000000000000022
[  107.258283] x21: ffffffc03538ec38 x20: ffffff8011082e78
[  107.263537] x19: ffffffbf20000000 x18: 0000000000000000
[  107.268790] x17: 0000000000000000 x16: 0000000000000000
[  107.274044] x15: 0000000000000000 x14: 0000000000000000
[  107.279297] x13: 0000000000000000 x12: 0000000000000030
[  107.284550] x11: 0000000000000030 x10: 0101010101010101
[  107.289804] x9 : ff7274615e68726c x8 : 7f7f7f7f7f7f7f7f
[  107.295057] x7 : feff64756e6c6471 x6 : 0000000000008080
[  107.300310] x5 : 0000000000000000 x4 : 0000000000000000
[  107.305564] x3 : ffffffc039b6ec00 x2 : fffffffffffffffe
[  107.310817] x1 : ffffffffffffffff x0 : fffffffffffffffe
[  107.316072] Process bash (pid: 491, stack limit = 0x000000004ebd4ecd)
[  107.322442] Call trace:
[  107.324858]  page_mapping+0x18/0x118
[  107.328392]  __dump_page+0x1c/0x398
[  107.331840]  dump_page+0xc/0x18
[  107.334945]  remove_store+0xbc/0x120
[  107.338479]  dev_attr_store+0x18/0x28
[  107.342103]  sysfs_kf_write+0x40/0x50
[  107.345722]  kernfs_fop_write+0x130/0x1d8
[  107.349687]  __vfs_write+0x30/0x180
[  107.353134]  vfs_write+0xb4/0x1a0
[  107.356410]  ksys_write+0x60/0xd0
[  107.359686]  __arm64_sys_write+0x18/0x20
[  107.363565]  el0_svc_common+0x94/0xf8
[  107.367184]  el0_svc_handler+0x68/0x70
[  107.370890]  el0_svc+0x8/0xc
[  107.373737] Code: f9400401 d1000422 f240003f 9a801040 (f9400402)
[  107.379766] ---[ end trace cdb5eb5bf435cecb ]---
-----

While after this patch, DEBUG_VM works as intended:
-----
[   46.835963] page:ffffffbf20000000 is uninitialized and poisoned
[   46.835970] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff 
ffffffffffffffff
[   46.849520] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff 
ffffffffffffffff
[   46.857194] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
[   46.863170] ------------[ cut here ]------------
[   46.867736] kernel BUG at ./include/linux/mm.h:1006!
[   46.872646] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[   46.878071] Modules linked in:
[   46.881092] CPU: 1 PID: 483 Comm: bash Not tainted 5.0.0-rc1+ #3
[   46.887032] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Dec 17 2018
[   46.897704] pstate: 40000005 (nZcv daif -PAN -UAO)
[   46.902449] pc : remove_store+0xbc/0x120
...
-----

Robin.

>> Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page")
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   mm/debug.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/debug.c b/mm/debug.c
>> index 0abb987dad9b..1611cf00a137 100644
>> --- a/mm/debug.c
>> +++ b/mm/debug.c
>> @@ -44,7 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
>>   
>>   void __dump_page(struct page *page, const char *reason)
>>   {
>> -	struct address_space *mapping = page_mapping(page);
>> +	struct address_space *mapping;
>>   	bool page_poisoned = PagePoisoned(page);
>>   	int mapcount;
>>   
>> @@ -58,6 +58,8 @@ void __dump_page(struct page *page, const char *reason)
>>   		goto hex_only;
>>   	}
>>   
>> +	mapping = page_mapping(page);
>> +
>>   	/*
>>   	 * Avoid VM_BUG_ON() in page_mapcount().
>>   	 * page->_mapcount space in struct page is used by sl[aou]b pages to
>> -- 
>> 2.20.1.dirty
>>
>
Michal Hocko Feb. 13, 2019, 2:54 p.m. UTC | #3
On Wed 13-02-19 14:38:37, Robin Murphy wrote:
> On 13/02/2019 14:23, Michal Hocko wrote:
> > On Wed 13-02-19 13:40:49, Robin Murphy wrote:
> > > Evaluating page_mapping() on a poisoned page ends up dereferencing junk
> > > and making PF_POISONED_CHECK() considerably crashier than intended. Fix
> > > that by not inspecting the mapping until we've determined that it's
> > > likely to be valid.
> > 
> > Has this ever triggered? I am mainly asking because there is no usage of
> > mapping so I would expect that the compiler wouldn't really call
> > page_mapping until it is really used.
> 
> A function call is a sequence point, so any compiler that did that would be
> totally broken.

Ohh, right I thought this is a static inline.

> The crash looks like this (now from an explicit dump_page() call before it
> happens naturally deep within pfn_to_nid()):

please add this to the changelog.

Acked-by: Michal Hocko <mhocko@suse.com>

> -----
> [  107.147056] Unable to handle kernel NULL pointer dereference at virtual
> address 0000000000000006
> [  107.155774] Mem abort info:
> [  107.158546]   ESR = 0x96000005
> [  107.161572]   Exception class = DABT (current EL), IL = 32 bits
> [  107.167437]   SET = 0, FnV = 0
> [  107.170460]   EA = 0, S1PTW = 0
> [  107.173568] Data abort info:
> [  107.176419]   ISV = 0, ISS = 0x00000005
> [  107.180218]   CM = 0, WnR = 0
> [  107.183151] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c2f6ac38
> [  107.189702] [0000000000000006] pgd=0000000000000000, pud=0000000000000000
> [  107.196430] Internal error: Oops: 96000005 [#1] PREEMPT SMP
> [  107.201942] Modules linked in:
> [  107.204962] CPU: 2 PID: 491 Comm: bash Not tainted 5.0.0-rc1+ #1
> [  107.210903] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno
> Development Platform, BIOS EDK II Dec 17 2018
> [  107.221576] pstate: 00000005 (nzcv daif -PAN -UAO)
> [  107.226321] pc : page_mapping+0x18/0x118
> [  107.230200] lr : __dump_page+0x1c/0x398
> [  107.233990] sp : ffffff8011a53c30
> [  107.237265] x29: ffffff8011a53c30 x28: ffffffc039b6ec00
> [  107.242520] x27: 0000000000000000 x26: 0000000000000000
> [  107.247775] x25: 0000000056000000 x24: 0000000000000015
> [  107.253029] x23: ffffff80114d8b18 x22: 0000000000000022
> [  107.258283] x21: ffffffc03538ec38 x20: ffffff8011082e78
> [  107.263537] x19: ffffffbf20000000 x18: 0000000000000000
> [  107.268790] x17: 0000000000000000 x16: 0000000000000000
> [  107.274044] x15: 0000000000000000 x14: 0000000000000000
> [  107.279297] x13: 0000000000000000 x12: 0000000000000030
> [  107.284550] x11: 0000000000000030 x10: 0101010101010101
> [  107.289804] x9 : ff7274615e68726c x8 : 7f7f7f7f7f7f7f7f
> [  107.295057] x7 : feff64756e6c6471 x6 : 0000000000008080
> [  107.300310] x5 : 0000000000000000 x4 : 0000000000000000
> [  107.305564] x3 : ffffffc039b6ec00 x2 : fffffffffffffffe
> [  107.310817] x1 : ffffffffffffffff x0 : fffffffffffffffe
> [  107.316072] Process bash (pid: 491, stack limit = 0x000000004ebd4ecd)
> [  107.322442] Call trace:
> [  107.324858]  page_mapping+0x18/0x118
> [  107.328392]  __dump_page+0x1c/0x398
> [  107.331840]  dump_page+0xc/0x18
> [  107.334945]  remove_store+0xbc/0x120
> [  107.338479]  dev_attr_store+0x18/0x28
> [  107.342103]  sysfs_kf_write+0x40/0x50
> [  107.345722]  kernfs_fop_write+0x130/0x1d8
> [  107.349687]  __vfs_write+0x30/0x180
> [  107.353134]  vfs_write+0xb4/0x1a0
> [  107.356410]  ksys_write+0x60/0xd0
> [  107.359686]  __arm64_sys_write+0x18/0x20
> [  107.363565]  el0_svc_common+0x94/0xf8
> [  107.367184]  el0_svc_handler+0x68/0x70
> [  107.370890]  el0_svc+0x8/0xc
> [  107.373737] Code: f9400401 d1000422 f240003f 9a801040 (f9400402)
> [  107.379766] ---[ end trace cdb5eb5bf435cecb ]---
> -----
> 
> While after this patch, DEBUG_VM works as intended:
> -----
> [   46.835963] page:ffffffbf20000000 is uninitialized and poisoned
> [   46.835970] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
> ffffffffffffffff
> [   46.849520] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
> ffffffffffffffff
> [   46.857194] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> [   46.863170] ------------[ cut here ]------------
> [   46.867736] kernel BUG at ./include/linux/mm.h:1006!
> [   46.872646] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> [   46.878071] Modules linked in:
> [   46.881092] CPU: 1 PID: 483 Comm: bash Not tainted 5.0.0-rc1+ #3
> [   46.887032] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno
> Development Platform, BIOS EDK II Dec 17 2018
> [   46.897704] pstate: 40000005 (nZcv daif -PAN -UAO)
> [   46.902449] pc : remove_store+0xbc/0x120
> ...
> -----
> 
> Robin.
> 
> > > Fixes: 1c6fb1d89e73 ("mm: print more information about mapping in __dump_page")
> > > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > > ---
> > >   mm/debug.c | 4 +++-
> > >   1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/mm/debug.c b/mm/debug.c
> > > index 0abb987dad9b..1611cf00a137 100644
> > > --- a/mm/debug.c
> > > +++ b/mm/debug.c
> > > @@ -44,7 +44,7 @@ const struct trace_print_flags vmaflag_names[] = {
> > >   void __dump_page(struct page *page, const char *reason)
> > >   {
> > > -	struct address_space *mapping = page_mapping(page);
> > > +	struct address_space *mapping;
> > >   	bool page_poisoned = PagePoisoned(page);
> > >   	int mapcount;
> > > @@ -58,6 +58,8 @@ void __dump_page(struct page *page, const char *reason)
> > >   		goto hex_only;
> > >   	}
> > > +	mapping = page_mapping(page);
> > > +
> > >   	/*
> > >   	 * Avoid VM_BUG_ON() in page_mapcount().
> > >   	 * page->_mapcount space in struct page is used by sl[aou]b pages to
> > > -- 
> > > 2.20.1.dirty
> > > 
> >
diff mbox series

Patch

diff --git a/mm/debug.c b/mm/debug.c
index 0abb987dad9b..1611cf00a137 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -44,7 +44,7 @@  const struct trace_print_flags vmaflag_names[] = {
 
 void __dump_page(struct page *page, const char *reason)
 {
-	struct address_space *mapping = page_mapping(page);
+	struct address_space *mapping;
 	bool page_poisoned = PagePoisoned(page);
 	int mapcount;
 
@@ -58,6 +58,8 @@  void __dump_page(struct page *page, const char *reason)
 		goto hex_only;
 	}
 
+	mapping = page_mapping(page);
+
 	/*
 	 * Avoid VM_BUG_ON() in page_mapcount().
 	 * page->_mapcount space in struct page is used by sl[aou]b pages to