[v11] PM / hibernate: Verify the consistent of e820 memory map by md5 digest

Message ID 20161007163108.GP27959@linux-rxt1.site (mailing list archive)
State Not Applicable, archived
Commit Message

joeyli Oct. 7, 2016, 4:31 p.m. UTC
Hi Chen Yu,

On Sun, Sep 25, 2016 at 12:17:57PM +0800, Chen Yu wrote:
> On some platforms, there is occasional panic triggered when trying to
> resume from hibernation, a typical panic looks like:
> 
> "BUG: unable to handle kernel paging request at ffff880085894000
> IP: [<ffffffff810c5dc2>] load_image_lzo+0x8c2/0xe70"
> 
> Investigation carried out by Lee Chun-Yi shows that this is because
> e820 map has been changed by BIOS across hibernation, and one
> of the page frames from suspend kernel is right located in restore
> kernel's unmapped region, so panic comes out when accessing unmapped
> kernel address.
>

Sorry, I can no longer get hold of the machine that showed the issue, so
for testing I added a patch that fools the kernel into seeing a changed
e820 map on S4 resume.
 
> In order to expose this issue earlier, the md5 hash of e820 map
> is passed from suspend kernel to restore kernel, and the restore
> kernel will terminate the resume process once it finds the md5
> hash are not the same.
>
[...snip] 
> ---
>  arch/x86/power/hibernate_64.c | 92 ++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 90 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
> index 9634557..d81b1af 100644
> --- a/arch/x86/power/hibernate_64.c
> +++ b/arch/x86/power/hibernate_64.c
> @@ -11,6 +11,10 @@
>  #include <linux/gfp.h>
>  #include <linux/smp.h>
>  #include <linux/suspend.h>
> +#include <linux/scatterlist.h>
> +#include <linux/kdebug.h>

[...snip]

> @@ -216,5 +297,12 @@ int arch_hibernation_header_restore(void *addr)
>  	restore_jump_address = rdr->jump_address;
>  	jump_address_phys = rdr->jump_address_phys;
>  	restore_cr3 = rdr->cr3;
> -	return (rdr->magic == RESTORE_MAGIC) ? 0 : -EINVAL;
> +
> +	if (rdr->magic != RESTORE_MAGIC)
> +		return -EINVAL;
> +
> +	if (hibernation_e820_mismatch(rdr->e820_digest))
> +		return -ENODEV;
> +
> +	return 0;
>  }
> --

Because the check_image_kernel() function doesn't check which error was
returned, the kernel only shows "PM: Image mismatch: architecture specific
data". That one message covers two different failure reasons.
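
To illustrate the point, here is a minimal userspace model, not the kernel
code itself. The names with a _model suffix, the simplified record layout,
the e820_changed flag and the RESTORE_MAGIC value are assumptions made for
this sketch. It only shows how a caller that tests the return value for
non-zero ends up with the same string for both errors:

```c
#include <errno.h>
#include <stddef.h>

/* Placeholder value for this sketch only. */
#define RESTORE_MAGIC 0x23456789ABCDEF01ULL

/* Simplified stand-in for the arch-specific record stored in the image. */
struct restore_data_record {
	unsigned long long magic;
	int e820_changed;	/* stand-in for the digest comparison result */
};

/* Models the patched arch hook: two distinct error codes. */
static int arch_header_restore_model(const struct restore_data_record *rdr)
{
	if (rdr->magic != RESTORE_MAGIC)
		return -EINVAL;

	if (rdr->e820_changed)
		return -ENODEV;

	return 0;
}

/*
 * Models a generic caller that only tests for non-zero: -EINVAL and
 * -ENODEV both collapse into one reason string.
 */
static const char *check_image_kernel_model(const struct restore_data_record *rdr)
{
	return arch_header_restore_model(rdr) ?
			"architecture specific data" : NULL;
}
```

With this model, a record with a bad magic and a record with a changed
e820 map return different codes from the arch hook, yet the caller reports
the same reason for both.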
 
I suggest printing a log message, as the restore function in the ARM64
architecture does. Something like this; please feel free to modify the
wording:


The other parts of your patch look good to me.


Thanks a lot!
Joey Lee
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Chen Yu Oct. 8, 2016, 5:03 p.m. UTC | #1
Hi Joey,
On Sat, Oct 08, 2016 at 12:31:08AM +0800, joeyli wrote:
> Index: linux/arch/x86/power/hibernate_64.c
> ===================================================================
> --- linux.orig/arch/x86/power/hibernate_64.c
> +++ linux/arch/x86/power/hibernate_64.c
> @@ -298,11 +298,16 @@ int arch_hibernation_header_restore(void
>         jump_address_phys = rdr->jump_address_phys;
>         restore_cr3 = rdr->cr3;
>  
> -       if (rdr->magic != RESTORE_MAGIC)
> +
> +       if (rdr->magic != RESTORE_MAGIC) {
> +               pr_crit("Hibernate image not generated by this kernel!\n");
>                 return -EINVAL;
> +       }
>  
> -       if (hibernation_e820_mismatch(rdr->e820_digest))
> +       if (hibernation_e820_mismatch(rdr->e820_digest)) {
> +               pr_crit("The e820 saved regions changed!\n");
>                 return -ENODEV;
> +       }
>  
>         return 0;
>  }
> 
OK, I will refresh it after 4.9-rc1 is released, because of a recent e820
modification.

Thanks,
Yu

Patch

Index: linux/arch/x86/power/hibernate_64.c
===================================================================
--- linux.orig/arch/x86/power/hibernate_64.c
+++ linux/arch/x86/power/hibernate_64.c
@@ -298,11 +298,16 @@  int arch_hibernation_header_restore(void
        jump_address_phys = rdr->jump_address_phys;
        restore_cr3 = rdr->cr3;
 
-       if (rdr->magic != RESTORE_MAGIC)
+
+       if (rdr->magic != RESTORE_MAGIC) {
+               pr_crit("Hibernate image not generated by this kernel!\n");
                return -EINVAL;
+       }
 
-       if (hibernation_e820_mismatch(rdr->e820_digest))
+       if (hibernation_e820_mismatch(rdr->e820_digest)) {
+               pr_crit("The e820 saved regions changed!\n");
                return -ENODEV;
+       }
 
        return 0;
 }
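
The overall check discussed in this thread (save a digest of the memory map
in the image header, recompute it on resume, and report each failure with
its own message) can be sketched in userspace as follows. This is only an
illustration under stated assumptions: struct mem_region, struct
image_header, header_save() and header_restore() are hypothetical names, the
RESTORE_MAGIC value is a placeholder, and 64-bit FNV-1a is swapped in for
md5 purely to keep the example short.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder value for this sketch only. */
#define RESTORE_MAGIC 0x23456789ABCDEF01ULL

/* Hypothetical, simplified e820-style entry: start, size, type. */
struct mem_region {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/* Hypothetical image header: magic plus the digest saved at suspend. */
struct image_header {
	uint64_t magic;
	uint64_t map_digest;
};

/* FNV-1a over the low nbytes of v; a compact stand-in for md5. */
static uint64_t fnv1a_update(uint64_t h, uint64_t v, size_t nbytes)
{
	for (size_t i = 0; i < nbytes; i++) {
		h ^= (v >> (8 * i)) & 0xff;
		h *= 0x100000001b3ULL;		/* 64-bit FNV prime */
	}
	return h;
}

/* Hash field by field so struct padding never enters the digest. */
static uint64_t map_digest(const struct mem_region *map, size_t n)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* 64-bit FNV offset basis */

	for (size_t i = 0; i < n; i++) {
		h = fnv1a_update(h, map[i].addr, 8);
		h = fnv1a_update(h, map[i].size, 8);
		h = fnv1a_update(h, map[i].type, 4);
	}
	return h;
}

/* Suspend side: record the magic and the digest of the current map. */
static void header_save(struct image_header *hdr,
			const struct mem_region *map, size_t n)
{
	hdr->magic = RESTORE_MAGIC;
	hdr->map_digest = map_digest(map, n);
}

/* Resume side: distinct message and error code for each failure. */
static int header_restore(const struct image_header *hdr,
			  const struct mem_region *map, size_t n)
{
	if (hdr->magic != RESTORE_MAGIC) {
		fprintf(stderr, "Hibernate image not generated by this kernel!\n");
		return -EINVAL;
	}

	if (hdr->map_digest != map_digest(map, n)) {
		fprintf(stderr, "The e820 saved regions changed!\n");
		return -ENODEV;
	}

	return 0;
}
```

Calling header_save() with the map seen at suspend and header_restore() with
the map seen at resume returns 0 when the two maps match, -ENODEV when a
region differs, and -EINVAL when the magic is wrong, each with its own log
line.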