diff mbox

[Resend] x86/power/64: Always create temporary identity mapping correctly

Message ID 2206547.eDj3RJQyE5@vostro.rjw.lan (mailing list archive)
State New, archived
Headers show

Commit Message

Rafael J. Wysocki Aug. 8, 2016, 1:31 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The low-level resume-from-hibernation code on x86-64 uses
kernel_ident_mapping_init() to create the temoprary identity mapping,
but that function assumes that the offset between kernel virtual
addresses and physical addresses is aligned on the PGD level.

However, with a randomized identity mapping base, it may be aligned
on the PUD level and if that happens, the temporary identity mapping
created by set_up_temporary_mappings() will not reflect the actual
kernel identity mapping and the image restoration will fail as a
result (leading to a kernel panic most of the time).

To fix this problem, rework kernel_ident_mapping_init() to support
unaligned offsets between KVA and PA up to the PMD level and make
set_up_temporary_mappings() use it as approprtiate.

Reported-by: Thomas Garnier <thgarnie@google.com>
Suggested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
---

This is sort of urgent, because hibernation doesn't work with KASLR on x86-64
in 4.8-rc1 AFAICS and this should make them work together again.

Unless anyone sees any problems with it, I'll queue it up for 4.8-rc2.

Thomas, would it be possible to test it with KASLR enabled, please?

Thanks,
Rafael

---
 arch/x86/include/asm/init.h   |    4 ++--
 arch/x86/mm/ident_map.c       |   19 +++++++++++--------
 arch/x86/power/hibernate_64.c |    2 +-
 3 files changed, 14 insertions(+), 11 deletions(-)

Comments

Borislav Petkov Aug. 8, 2016, 1:40 p.m. UTC | #1
On Mon, Aug 08, 2016 at 03:31:31PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The low-level resume-from-hibernation code on x86-64 uses
> kernel_ident_mapping_init() to create the temoprary identity mapping,
> but that function assumes that the offset between kernel virtual
> addresses and physical addresses is aligned on the PGD level.
> 
> However, with a randomized identity mapping base, it may be aligned
> on the PUD level and if that happens, the temporary identity mapping
> created by set_up_temporary_mappings() will not reflect the actual
> kernel identity mapping and the image restoration will fail as a
> result (leading to a kernel panic most of the time).
> 
> To fix this problem, rework kernel_ident_mapping_init() to support
> unaligned offsets between KVA and PA up to the PMD level and make
> set_up_temporary_mappings() use it as approprtiate.
> 
> Reported-by: Thomas Garnier <thgarnie@google.com>

Reported-by: Borislav Petkov <bp@suse.de>

> Suggested-by: Yinghai Lu <yinghai@kernel.org>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Acked-by: Yinghai Lu <yinghai@kernel.org>
> ---
> 
> This is sort of urgent, because hibernation doesn't work with KASLR on x86-64
> in 4.8-rc1 AFAICS and this should make them work together again.
> 
> Unless anyone sees any problems with it, I'll queue it up for 4.8-rc2.
> 
> Thomas, would it be possible to test it with KASLR enabled, please?

Is that the only patch which needs to be tested? Ontop of which tree?

CONFIG_RANDOMIZE_MEMORY blew up s2d on my laptop here so I'll run it
once I have the required info from you :)

Thanks.
Rafael J. Wysocki Aug. 8, 2016, 1:54 p.m. UTC | #2
On Mon, Aug 8, 2016 at 3:40 PM, Borislav Petkov <bp@suse.de> wrote:
> On Mon, Aug 08, 2016 at 03:31:31PM +0200, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> The low-level resume-from-hibernation code on x86-64 uses
>> kernel_ident_mapping_init() to create the temoprary identity mapping,
>> but that function assumes that the offset between kernel virtual
>> addresses and physical addresses is aligned on the PGD level.
>>
>> However, with a randomized identity mapping base, it may be aligned
>> on the PUD level and if that happens, the temporary identity mapping
>> created by set_up_temporary_mappings() will not reflect the actual
>> kernel identity mapping and the image restoration will fail as a
>> result (leading to a kernel panic most of the time).
>>
>> To fix this problem, rework kernel_ident_mapping_init() to support
>> unaligned offsets between KVA and PA up to the PMD level and make
>> set_up_temporary_mappings() use it as approprtiate.
>>
>> Reported-by: Thomas Garnier <thgarnie@google.com>
>
> Reported-by: Borislav Petkov <bp@suse.de>
>
>> Suggested-by: Yinghai Lu <yinghai@kernel.org>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Acked-by: Yinghai Lu <yinghai@kernel.org>
>> ---
>>
>> This is sort of urgent, because hibernation doesn't work with KASLR on x86-64
>> in 4.8-rc1 AFAICS and this should make them work together again.
>>
>> Unless anyone sees any problems with it, I'll queue it up for 4.8-rc2.
>>
>> Thomas, would it be possible to test it with KASLR enabled, please?
>
> Is that the only patch which needs to be tested? Ontop of which tree?

That should be the only one on top of plain 4.8-rc1.

If it doesn't help, we need more work to do. :-)

> CONFIG_RANDOMIZE_MEMORY blew up s2d on my laptop here so I'll run it
> once I have the required info from you :)

Thanks!

Best,
Rafael
Thomas Garnier Aug. 8, 2016, 6 p.m. UTC | #3
On Mon, Aug 8, 2016 at 6:54 AM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Mon, Aug 8, 2016 at 3:40 PM, Borislav Petkov <bp@suse.de> wrote:
>> On Mon, Aug 08, 2016 at 03:31:31PM +0200, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> The low-level resume-from-hibernation code on x86-64 uses
>>> kernel_ident_mapping_init() to create the temoprary identity mapping,
>>> but that function assumes that the offset between kernel virtual
>>> addresses and physical addresses is aligned on the PGD level.
>>>
>>> However, with a randomized identity mapping base, it may be aligned
>>> on the PUD level and if that happens, the temporary identity mapping
>>> created by set_up_temporary_mappings() will not reflect the actual
>>> kernel identity mapping and the image restoration will fail as a
>>> result (leading to a kernel panic most of the time).
>>>
>>> To fix this problem, rework kernel_ident_mapping_init() to support
>>> unaligned offsets between KVA and PA up to the PMD level and make
>>> set_up_temporary_mappings() use it as approprtiate.
>>>
>>> Reported-by: Thomas Garnier <thgarnie@google.com>
>>
>> Reported-by: Borislav Petkov <bp@suse.de>
>>
>>> Suggested-by: Yinghai Lu <yinghai@kernel.org>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> Acked-by: Yinghai Lu <yinghai@kernel.org>
>>> ---
>>>
>>> This is sort of urgent, because hibernation doesn't work with KASLR on x86-64
>>> in 4.8-rc1 AFAICS and this should make them work together again.
>>>
>>> Unless anyone sees any problems with it, I'll queue it up for 4.8-rc2.
>>>
>>> Thomas, would it be possible to test it with KASLR enabled, please?
>>

I tested it on my setup couple times. Worked well.

>> Is that the only patch which needs to be tested? Ontop of which tree?
>
> That should be the only one on top of plain 4.8-rc1.
>
> If it doesn't help, we need more work to do. :-)
>
>> CONFIG_RANDOMIZE_MEMORY blew up s2d on my laptop here so I'll run it
>> once I have the required info from you :)
>
> Thanks!
>
> Best,
> Rafael
Rafael J. Wysocki Aug. 8, 2016, 8:01 p.m. UTC | #4
On Mon, Aug 8, 2016 at 8:00 PM, Thomas Garnier <thgarnie@google.com> wrote:
> On Mon, Aug 8, 2016 at 6:54 AM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Mon, Aug 8, 2016 at 3:40 PM, Borislav Petkov <bp@suse.de> wrote:
>>> On Mon, Aug 08, 2016 at 03:31:31PM +0200, Rafael J. Wysocki wrote:
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>> The low-level resume-from-hibernation code on x86-64 uses
>>>> kernel_ident_mapping_init() to create the temoprary identity mapping,
>>>> but that function assumes that the offset between kernel virtual
>>>> addresses and physical addresses is aligned on the PGD level.
>>>>
>>>> However, with a randomized identity mapping base, it may be aligned
>>>> on the PUD level and if that happens, the temporary identity mapping
>>>> created by set_up_temporary_mappings() will not reflect the actual
>>>> kernel identity mapping and the image restoration will fail as a
>>>> result (leading to a kernel panic most of the time).
>>>>
>>>> To fix this problem, rework kernel_ident_mapping_init() to support
>>>> unaligned offsets between KVA and PA up to the PMD level and make
>>>> set_up_temporary_mappings() use it as approprtiate.
>>>>
>>>> Reported-by: Thomas Garnier <thgarnie@google.com>
>>>
>>> Reported-by: Borislav Petkov <bp@suse.de>
>>>
>>>> Suggested-by: Yinghai Lu <yinghai@kernel.org>
>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>> Acked-by: Yinghai Lu <yinghai@kernel.org>
>>>> ---
>>>>
>>>> This is sort of urgent, because hibernation doesn't work with KASLR on x86-64
>>>> in 4.8-rc1 AFAICS and this should make them work together again.
>>>>
>>>> Unless anyone sees any problems with it, I'll queue it up for 4.8-rc2.
>>>>
>>>> Thomas, would it be possible to test it with KASLR enabled, please?
>>>
>
> I tested it on my setup couple times. Worked well.

Thanks!
Borislav Petkov Aug. 9, 2016, 7:02 a.m. UTC | #5
On Mon, Aug 08, 2016 at 03:54:48PM +0200, Rafael J. Wysocki wrote:
> That should be the only one on top of plain 4.8-rc1.
> 
> If it doesn't help, we need more work to do. :-)

Yes, we do.

The machine triple-faults *after* reading up the hibernation image.
It hits 100%, then tries to switch to the boot kernel and BOOM, BIOS
screen.

I'm attaching the .config in case you want to reproduce it. The machine
is an IVB thinkpad x230.

Thanks.
Jiri Kosina Aug. 9, 2016, 9:23 a.m. UTC | #6
On Mon, 8 Aug 2016, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The low-level resume-from-hibernation code on x86-64 uses
> kernel_ident_mapping_init() to create the temoprary identity mapping,
> but that function assumes that the offset between kernel virtual
> addresses and physical addresses is aligned on the PGD level.
> 
> However, with a randomized identity mapping base, it may be aligned
> on the PUD level and if that happens, the temporary identity mapping
> created by set_up_temporary_mappings() will not reflect the actual
> kernel identity mapping and the image restoration will fail as a
> result (leading to a kernel panic most of the time).
> 
> To fix this problem, rework kernel_ident_mapping_init() to support
> unaligned offsets between KVA and PA up to the PMD level and make
> set_up_temporary_mappings() use it as approprtiate.
> 
> Reported-by: Thomas Garnier <thgarnie@google.com>
> Suggested-by: Yinghai Lu <yinghai@kernel.org>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Acked-by: Yinghai Lu <yinghai@kernel.org>
> ---
> 
> This is sort of urgent, because hibernation doesn't work with KASLR on x86-64
> in 4.8-rc1 AFAICS and this should make them work together again.
> 
> Unless anyone sees any problems with it, I'll queue it up for 4.8-rc2.
> 
> Thomas, would it be possible to test it with KASLR enabled, please?

Unfortunately this applied on top of -rc1 still doesn't solve the reboot 
after reading hibernation image (I'd guess due to triple fault) with 
CONFIG_RANDOMIZE_MEMORY=y on my system.

With CONFIG_RANDOMIZE_MEMORY=n, the system resumes correctly.
Rafael J. Wysocki Aug. 9, 2016, 11:47 a.m. UTC | #7
On Tue, Aug 9, 2016 at 9:02 AM, Borislav Petkov <bp@suse.de> wrote:
> On Mon, Aug 08, 2016 at 03:54:48PM +0200, Rafael J. Wysocki wrote:
>> That should be the only one on top of plain 4.8-rc1.
>>
>> If it doesn't help, we need more work to do. :-)
>
> Yes, we do.
>
> The machine triple-faults *after* reading up the hibernation image.
> It hits 100%, then tries to switch to the boot kernel and BOOM, BIOS
> screen.
>
> I'm attaching the .config in case you want to reproduce it. The machine
> is an IVB thinkpad x230.

Yes, I'm going to try to reproduce it.

I'm wondering what the difference between your .config and the Thomas'
.config is, as he has CONFIG_RANDOMIZE_MEMORY=y set too.

Thomas, can you attach your .config, please?

Thanks,
Rafael
Rafael J. Wysocki Aug. 9, 2016, 11:56 a.m. UTC | #8
On Tue, Aug 9, 2016 at 11:23 AM, Jiri Kosina <jikos@kernel.org> wrote:
> On Mon, 8 Aug 2016, Rafael J. Wysocki wrote:
>
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> The low-level resume-from-hibernation code on x86-64 uses
>> kernel_ident_mapping_init() to create the temoprary identity mapping,
>> but that function assumes that the offset between kernel virtual
>> addresses and physical addresses is aligned on the PGD level.
>>
>> However, with a randomized identity mapping base, it may be aligned
>> on the PUD level and if that happens, the temporary identity mapping
>> created by set_up_temporary_mappings() will not reflect the actual
>> kernel identity mapping and the image restoration will fail as a
>> result (leading to a kernel panic most of the time).
>>
>> To fix this problem, rework kernel_ident_mapping_init() to support
>> unaligned offsets between KVA and PA up to the PMD level and make
>> set_up_temporary_mappings() use it as approprtiate.
>>
>> Reported-by: Thomas Garnier <thgarnie@google.com>
>> Suggested-by: Yinghai Lu <yinghai@kernel.org>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Acked-by: Yinghai Lu <yinghai@kernel.org>
>> ---
>>
>> This is sort of urgent, because hibernation doesn't work with KASLR on x86-64
>> in 4.8-rc1 AFAICS and this should make them work together again.
>>
>> Unless anyone sees any problems with it, I'll queue it up for 4.8-rc2.
>>
>> Thomas, would it be possible to test it with KASLR enabled, please?
>
> Unfortunately this applied on top of -rc1 still doesn't solve the reboot
> after reading hibernation image (I'd guess due to triple fault) with
> CONFIG_RANDOMIZE_MEMORY=y on my system.
>
> With CONFIG_RANDOMIZE_MEMORY=n, the system resumes correctly.

Here's a list of commits from Thomas that are related to memory randomization.

210e7a43fa90 mm: SLUB freelist randomization
7c00fce98c3e mm: reorganize SLAB freelist randomization
4ff5308744f5 x86/mm: Do not reference phys addr beyond kernel
90397a417796 x86/mm: Add memory hotplug support for KASLR memory randomization
a95ae27c2ee1 x86/mm: Enable KASLR for vmalloc memory regions
021182e52fe0 x86/mm: Enable KASLR for physical mapping memory regions
0483e1fa6e09 x86/mm: Implement ASLR for kernel memory regions
b234e8a09003 x86/mm: Separate variable for trampoline PGD
faa379332f3c x86/mm: Add PUD VA support for physical mapping
59b3d0206d74 x86/mm: Update physical mapping variable names
d899a7d146a2 x86/mm: Refactor KASLR entropy functions

I wonder if it is viable to revert them one by one top-to-bottom and
see which one of them causes things to fail?

Thanks,
Rafael
Jiri Kosina Aug. 9, 2016, 12:58 p.m. UTC | #9
On Tue, 9 Aug 2016, Rafael J. Wysocki wrote:

> Here's a list of commits from Thomas that are related to memory randomization.
> 
> 210e7a43fa90 mm: SLUB freelist randomization
> 7c00fce98c3e mm: reorganize SLAB freelist randomization
> 4ff5308744f5 x86/mm: Do not reference phys addr beyond kernel
> 90397a417796 x86/mm: Add memory hotplug support for KASLR memory randomization
> a95ae27c2ee1 x86/mm: Enable KASLR for vmalloc memory regions
> 021182e52fe0 x86/mm: Enable KASLR for physical mapping memory regions

Okay, I did one-by-one reverts, and the one above, i.e.

	commit 021182e52fe01c1f7b126f97fd6ba048dc4234fd
	Author: Thomas Garnier <thgarnie@google.com>
	Date:   Tue Jun 21 17:47:03 2016 -0700

	    x86/mm: Enable KASLR for physical mapping memory regions

is the one that is the culprit on my machine. With this one reverted, 
resume hibernation doesn't reboot (tripple fault?), but proceeds 
succesfully.
Jiri Kosina Aug. 9, 2016, 1:30 p.m. UTC | #10
On Tue, 9 Aug 2016, Jiri Kosina wrote:

> > 210e7a43fa90 mm: SLUB freelist randomization
> > 7c00fce98c3e mm: reorganize SLAB freelist randomization
> > 4ff5308744f5 x86/mm: Do not reference phys addr beyond kernel
> > 90397a417796 x86/mm: Add memory hotplug support for KASLR memory randomization
> > a95ae27c2ee1 x86/mm: Enable KASLR for vmalloc memory regions
> > 021182e52fe0 x86/mm: Enable KASLR for physical mapping memory regions
> 
> Okay, I did one-by-one reverts, and the one above, i.e.
> 
> 	commit 021182e52fe01c1f7b126f97fd6ba048dc4234fd
> 	Author: Thomas Garnier <thgarnie@google.com>
> 	Date:   Tue Jun 21 17:47:03 2016 -0700
> 
> 	    x86/mm: Enable KASLR for physical mapping memory regions
> 
> is the one that is the culprit on my machine. With this one reverted, 
> resume hibernation doesn't reboot (tripple fault?), but proceeds 
> succesfully.

As discussed with Rafael privately, I also tried this very patch 
(x86/power/64: Always create temporary identity mapping correctly) on top 
of the reverted revert of 021182e52fe01c1f7b1 (see the full log below), 
but such kernel triple faults on resume as well.

87c38d2 x86/power/64: Always create temporary identity mapping correctly
3cb504a Revert "Revert "x86/mm: Enable KASLR for physical mapping memory regions""
758850d Revert "x86/mm: Enable KASLR for physical mapping memory regions"
4a02dfb Revert "x86/mm: Enable KASLR for vmalloc memory regions"
037863f Revert "x86/mm: Add memory hotplug support for KASLR memory randomization"
3416a21 Revert "x86/mm: Do not reference phys addr beyond kernel"
69227be Revert "mm: reorganize SLAB freelist randomization"
a1d8d71 Revert "mm: SLUB freelist randomization"

IOW, 021182e52f introduces a bug for which there is no existing fix yet.
Thomas Garnier Aug. 9, 2016, 3 p.m. UTC | #11
On Tue, Aug 9, 2016 at 6:30 AM, Jiri Kosina <jikos@kernel.org> wrote:
> On Tue, 9 Aug 2016, Jiri Kosina wrote:
>
>> > 210e7a43fa90 mm: SLUB freelist randomization
>> > 7c00fce98c3e mm: reorganize SLAB freelist randomization
>> > 4ff5308744f5 x86/mm: Do not reference phys addr beyond kernel
>> > 90397a417796 x86/mm: Add memory hotplug support for KASLR memory randomization
>> > a95ae27c2ee1 x86/mm: Enable KASLR for vmalloc memory regions
>> > 021182e52fe0 x86/mm: Enable KASLR for physical mapping memory regions
>>
>> Okay, I did one-by-one reverts, and the one above, i.e.
>>
>>       commit 021182e52fe01c1f7b126f97fd6ba048dc4234fd
>>       Author: Thomas Garnier <thgarnie@google.com>
>>       Date:   Tue Jun 21 17:47:03 2016 -0700
>>
>>           x86/mm: Enable KASLR for physical mapping memory regions
>>
>> is the one that is the culprit on my machine. With this one reverted,
>> resume hibernation doesn't reboot (tripple fault?), but proceeds
>> succesfully.

My .config is attached. It is basically defconfig (x86_64) + kvmconfig
plus the following:

CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_RANDOMIZE_BASE=y
CONFIG_X86_NEED_RELOCS=y
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_RANDOMIZE_MEMORY=y
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
CONFIG_X86_PTDUMP_CORE=y
CONFIG_X86_PTDUMP=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
CONFIG_PANIC_ON_OOPS=y
CONFIG_KGDB=y
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_DWARF4=y

>
> As discussed with Rafael privately, I also tried this very patch
> (x86/power/64: Always create temporary identity mapping correctly) on top
> of the reverted revert of 021182e52fe01c1f7b1 (see the full log below),
> but such kernel triple faults on resume as well.
>
> 87c38d2 x86/power/64: Always create temporary identity mapping correctly
> 3cb504a Revert "Revert "x86/mm: Enable KASLR for physical mapping memory regions""
> 758850d Revert "x86/mm: Enable KASLR for physical mapping memory regions"
> 4a02dfb Revert "x86/mm: Enable KASLR for vmalloc memory regions"
> 037863f Revert "x86/mm: Add memory hotplug support for KASLR memory randomization"
> 3416a21 Revert "x86/mm: Do not reference phys addr beyond kernel"
> 69227be Revert "mm: reorganize SLAB freelist randomization"
> a1d8d71 Revert "mm: SLUB freelist randomization"
>
> IOW, 021182e52f introduces a bug for which there is no existing fix yet.

You mean it is something different from the previous KASLR bugs we saw?

>
> --
> Jiri Kosina
> SUSE Labs
>
Jiri Kosina Aug. 9, 2016, 3:05 p.m. UTC | #12
On Tue, 9 Aug 2016, Thomas Garnier wrote:

> >> Okay, I did one-by-one reverts, and the one above, i.e.
> >>
> >>       commit 021182e52fe01c1f7b126f97fd6ba048dc4234fd
> >>       Author: Thomas Garnier <thgarnie@google.com>
> >>       Date:   Tue Jun 21 17:47:03 2016 -0700
> >>
> >>           x86/mm: Enable KASLR for physical mapping memory regions
> >>
> >> is the one that is the culprit on my machine. With this one reverted,
> >> resume hibernation doesn't reboot (tripple fault?), but proceeds
> >> succesfully.
> 
> My .config is attached. It is basically defconfig (x86_64) + kvmconfig
> plus the following:
> 
> CONFIG_PHYSICAL_START=0x1000000
> CONFIG_RELOCATABLE=y
> CONFIG_RANDOMIZE_BASE=y
> CONFIG_X86_NEED_RELOCS=y
> CONFIG_PHYSICAL_ALIGN=0x1000000
> CONFIG_RANDOMIZE_MEMORY=y
> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
> CONFIG_X86_PTDUMP_CORE=y
> CONFIG_X86_PTDUMP=y
> CONFIG_KALLSYMS=y
> CONFIG_KALLSYMS_ALL=y
> CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
> CONFIG_KALLSYMS_BASE_RELATIVE=y
> CONFIG_PANIC_ON_OOPS=y
> CONFIG_KGDB=y
> CONFIG_EARLY_PRINTK=y
> CONFIG_EARLY_PRINTK_DBGP=y
> CONFIG_DEBUG_INFO=y
> CONFIG_DEBUG_INFO_DWARF4=y

The config I am reproducing the bug with (on thinkpad x200s) can be found 
at

	http://www.jikos.cz/jikos/junk/.config

Either later today or tomorrow I could test with the same physical start 
and align values you're using to see whether that'd make any difference.

> > As discussed with Rafael privately, I also tried this very patch
> > (x86/power/64: Always create temporary identity mapping correctly) on top
> > of the reverted revert of 021182e52fe01c1f7b1 (see the full log below),
> > but such kernel triple faults on resume as well.
> >
> > 87c38d2 x86/power/64: Always create temporary identity mapping correctly
> > 3cb504a Revert "Revert "x86/mm: Enable KASLR for physical mapping memory regions""
> > 758850d Revert "x86/mm: Enable KASLR for physical mapping memory regions"
> > 4a02dfb Revert "x86/mm: Enable KASLR for vmalloc memory regions"
> > 037863f Revert "x86/mm: Add memory hotplug support for KASLR memory randomization"
> > 3416a21 Revert "x86/mm: Do not reference phys addr beyond kernel"
> > 69227be Revert "mm: reorganize SLAB freelist randomization"
> > a1d8d71 Revert "mm: SLUB freelist randomization"
> >
> > IOW, 021182e52f introduces a bug for which there is no existing fix yet.
> 
> You mean it is something different from the previous KASLR bugs we saw?

No, I just wanted to explicitly point out that "x86/power/64: Always 
create temporary identity mapping correctly" is not a fix for this issue.
Rafael J. Wysocki Aug. 9, 2016, 4:18 p.m. UTC | #13
On Tue, Aug 9, 2016 at 5:05 PM, Jiri Kosina <jikos@kernel.org> wrote:
> On Tue, 9 Aug 2016, Thomas Garnier wrote:
>
>> >> Okay, I did one-by-one reverts, and the one above, i.e.
>> >>
>> >>       commit 021182e52fe01c1f7b126f97fd6ba048dc4234fd
>> >>       Author: Thomas Garnier <thgarnie@google.com>
>> >>       Date:   Tue Jun 21 17:47:03 2016 -0700
>> >>
>> >>           x86/mm: Enable KASLR for physical mapping memory regions
>> >>
>> >> is the one that is the culprit on my machine. With this one reverted,
>> >> resume hibernation doesn't reboot (tripple fault?), but proceeds
>> >> succesfully.
>>
>> My .config is attached. It is basically defconfig (x86_64) + kvmconfig
>> plus the following:
>>
>> CONFIG_PHYSICAL_START=0x1000000
>> CONFIG_RELOCATABLE=y
>> CONFIG_RANDOMIZE_BASE=y
>> CONFIG_X86_NEED_RELOCS=y
>> CONFIG_PHYSICAL_ALIGN=0x1000000
>> CONFIG_RANDOMIZE_MEMORY=y
>> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
>> CONFIG_X86_PTDUMP_CORE=y
>> CONFIG_X86_PTDUMP=y
>> CONFIG_KALLSYMS=y
>> CONFIG_KALLSYMS_ALL=y
>> CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
>> CONFIG_KALLSYMS_BASE_RELATIVE=y
>> CONFIG_PANIC_ON_OOPS=y
>> CONFIG_KGDB=y
>> CONFIG_EARLY_PRINTK=y
>> CONFIG_EARLY_PRINTK_DBGP=y
>> CONFIG_DEBUG_INFO=y
>> CONFIG_DEBUG_INFO_DWARF4=y
>
> The config I am reproducing the bug with (on thinkpad x200s) can be found
> at
>
>         http://www.jikos.cz/jikos/junk/.config
>
> Either later today or tomorrow I could test with the same physical start
> and align values you're using to see whether that'd make any difference.
>
>> > As discussed with Rafael privately, I also tried this very patch
>> > (x86/power/64: Always create temporary identity mapping correctly) on top
>> > of the reverted revert of 021182e52fe01c1f7b1 (see the full log below),
>> > but such kernel triple faults on resume as well.
>> >
>> > 87c38d2 x86/power/64: Always create temporary identity mapping correctly
>> > 3cb504a Revert "Revert "x86/mm: Enable KASLR for physical mapping memory regions""
>> > 758850d Revert "x86/mm: Enable KASLR for physical mapping memory regions"
>> > 4a02dfb Revert "x86/mm: Enable KASLR for vmalloc memory regions"
>> > 037863f Revert "x86/mm: Add memory hotplug support for KASLR memory randomization"
>> > 3416a21 Revert "x86/mm: Do not reference phys addr beyond kernel"
>> > 69227be Revert "mm: reorganize SLAB freelist randomization"
>> > a1d8d71 Revert "mm: SLUB freelist randomization"
>> >
>> > IOW, 021182e52f introduces a bug for which there is no existing fix yet.
>>
>> You mean it is something different from the previous KASLR bugs we saw?
>
> No, I just wanted to explicitly point out that "x86/power/64: Always
> create temporary identity mapping correctly" is not a fix for this issue.

It is better to say that the $subject patch is not sufficient to fix
it, because I'm quite confident that it is necessary for that. :-)

Without the $subject patch kernel_ident_mapping_init() makes
assumptions that simply are not met in the randomized identity mapping
base case.  Moreover, hibernation works for Thomas with $subject patch
applied, but it doesn't without it.

So there is something else that we are missing.

I have a murky suspicion, but it is really weird.  Namely, what if
restore_jump_address in set_up_temporary_text_mapping() happens to be
covered by the restore kernel's identity mapping?  Then, the image
kernel's entry point may get overwritten by something else in
core_restore_code().

But is this possible even?  Thomas?

Anyway, I'll try to reproduce this issue later today.

Thanks,
Rafael
Thomas Garnier Aug. 9, 2016, 4:27 p.m. UTC | #14
On Tue, Aug 9, 2016 at 9:18 AM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Tue, Aug 9, 2016 at 5:05 PM, Jiri Kosina <jikos@kernel.org> wrote:
>> On Tue, 9 Aug 2016, Thomas Garnier wrote:
>>
>>> >> Okay, I did one-by-one reverts, and the one above, i.e.
>>> >>
>>> >>       commit 021182e52fe01c1f7b126f97fd6ba048dc4234fd
>>> >>       Author: Thomas Garnier <thgarnie@google.com>
>>> >>       Date:   Tue Jun 21 17:47:03 2016 -0700
>>> >>
>>> >>           x86/mm: Enable KASLR for physical mapping memory regions
>>> >>
>>> >> is the one that is the culprit on my machine. With this one reverted,
>>> >> resume hibernation doesn't reboot (tripple fault?), but proceeds
>>> >> succesfully.
>>>
>>> My .config is attached. It is basically defconfig (x86_64) + kvmconfig
>>> plus the following:
>>>
>>> CONFIG_PHYSICAL_START=0x1000000
>>> CONFIG_RELOCATABLE=y
>>> CONFIG_RANDOMIZE_BASE=y
>>> CONFIG_X86_NEED_RELOCS=y
>>> CONFIG_PHYSICAL_ALIGN=0x1000000
>>> CONFIG_RANDOMIZE_MEMORY=y
>>> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
>>> CONFIG_X86_PTDUMP_CORE=y
>>> CONFIG_X86_PTDUMP=y
>>> CONFIG_KALLSYMS=y
>>> CONFIG_KALLSYMS_ALL=y
>>> CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
>>> CONFIG_KALLSYMS_BASE_RELATIVE=y
>>> CONFIG_PANIC_ON_OOPS=y
>>> CONFIG_KGDB=y
>>> CONFIG_EARLY_PRINTK=y
>>> CONFIG_EARLY_PRINTK_DBGP=y
>>> CONFIG_DEBUG_INFO=y
>>> CONFIG_DEBUG_INFO_DWARF4=y
>>
>> The config I am reproducing the bug with (on thinkpad x200s) can be found
>> at
>>
>>         http://www.jikos.cz/jikos/junk/.config
>>
>> Either later today or tomorrow I could test with the same physical start
>> and align values you're using to see whether that'd make any difference.
>>
>>> > As discussed with Rafael privately, I also tried this very patch
>>> > (x86/power/64: Always create temporary identity mapping correctly) on top
>>> > of the reverted revert of 021182e52fe01c1f7b1 (see the full log below),
>>> > but such kernel triple faults on resume as well.
>>> >
>>> > 87c38d2 x86/power/64: Always create temporary identity mapping correctly
>>> > 3cb504a Revert "Revert "x86/mm: Enable KASLR for physical mapping memory regions""
>>> > 758850d Revert "x86/mm: Enable KASLR for physical mapping memory regions"
>>> > 4a02dfb Revert "x86/mm: Enable KASLR for vmalloc memory regions"
>>> > 037863f Revert "x86/mm: Add memory hotplug support for KASLR memory randomization"
>>> > 3416a21 Revert "x86/mm: Do not reference phys addr beyond kernel"
>>> > 69227be Revert "mm: reorganize SLAB freelist randomization"
>>> > a1d8d71 Revert "mm: SLUB freelist randomization"
>>> >
>>> > IOW, 021182e52f introduces a bug for which there is no existing fix yet.
>>>
>>> You mean it is something different from the previous KASLR bugs we saw?
>>
>> No, I just wanted to explicitly point out that "x86/power/64: Always
>> create temporary identity mapping correctly" is not a fix for this issue.
>
> It is better to say that the $subject patch is not sufficient to fix
> it, because I'm quite confident that it is necessary for that. :-)
>
> Without the $subject patch kernel_ident_mapping_init() makes
> assumptions that simply are not met in the randomized identity mapping
> base case.  Moreover, hibernation works for Thomas with $subject patch
> applied, but it doesn't without it.
>
> So there is something else that we are missing.
>
> I have a murky suspicion, but it is really weird.  Namely, what if
> restore_jump_address in set_up_temporary_text_mapping() happens to be
> covered by the restore kernel's identity mapping?  Then, the image
> kernel's entry point may get overwritten by something else in
> core_restore_code().
>
> But is this possible even?  Thomas?

I had a similar theory before when I was investigating the original
crash. How is it avoided even without KASLR?

Given the space for the physical memory mapping, I doubt this issue
would happen all the time though.

>
> Anyway, I'll try to reproduce this issue later today.
>
> Thanks,
> Rafael
Jiri Kosina Aug. 9, 2016, 8:02 p.m. UTC | #15
On Tue, 9 Aug 2016, Rafael J. Wysocki wrote:

> I have a murky suspicion, but it is really weird.  Namely, what if
> restore_jump_address in set_up_temporary_text_mapping() happens to be
> covered by the restore kernel's identity mapping?  Then, the image
> kernel's entry point may get overwritten by something else in
> core_restore_code().

So this made me to actually test a scenario where I'd suspend a kernel 
that's known-broken (i.e. contains 021182e52fe), and then have it resumed 
by a kernel that has 021182e52fe reverted. It resumed successfully.

Just a datapoint.
Rafael J. Wysocki Aug. 9, 2016, 8:50 p.m. UTC | #16
On Tue, Aug 9, 2016 at 6:27 PM, Thomas Garnier <thgarnie@google.com> wrote:
> On Tue, Aug 9, 2016 at 9:18 AM, Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Tue, Aug 9, 2016 at 5:05 PM, Jiri Kosina <jikos@kernel.org> wrote:
>>> On Tue, 9 Aug 2016, Thomas Garnier wrote:
>>>
>>>> >> Okay, I did one-by-one reverts, and the one above, i.e.
>>>> >>
>>>> >>       commit 021182e52fe01c1f7b126f97fd6ba048dc4234fd
>>>> >>       Author: Thomas Garnier <thgarnie@google.com>
>>>> >>       Date:   Tue Jun 21 17:47:03 2016 -0700
>>>> >>
>>>> >>           x86/mm: Enable KASLR for physical mapping memory regions
>>>> >>
>>>> >> is the one that is the culprit on my machine. With this one reverted,
>>>> >> resume hibernation doesn't reboot (tripple fault?), but proceeds
>>>> >> succesfully.
>>>>
>>>> My .config is attached. It is basically defconfig (x86_64) + kvmconfig
>>>> plus the following:
>>>>
>>>> CONFIG_PHYSICAL_START=0x1000000
>>>> CONFIG_RELOCATABLE=y
>>>> CONFIG_RANDOMIZE_BASE=y
>>>> CONFIG_X86_NEED_RELOCS=y
>>>> CONFIG_PHYSICAL_ALIGN=0x1000000
>>>> CONFIG_RANDOMIZE_MEMORY=y
>>>> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
>>>> CONFIG_X86_PTDUMP_CORE=y
>>>> CONFIG_X86_PTDUMP=y
>>>> CONFIG_KALLSYMS=y
>>>> CONFIG_KALLSYMS_ALL=y
>>>> CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
>>>> CONFIG_KALLSYMS_BASE_RELATIVE=y
>>>> CONFIG_PANIC_ON_OOPS=y
>>>> CONFIG_KGDB=y
>>>> CONFIG_EARLY_PRINTK=y
>>>> CONFIG_EARLY_PRINTK_DBGP=y
>>>> CONFIG_DEBUG_INFO=y
>>>> CONFIG_DEBUG_INFO_DWARF4=y
>>>
>>> The config I am reproducing the bug with (on thinkpad x200s) can be found
>>> at
>>>
>>>         http://www.jikos.cz/jikos/junk/.config
>>>
>>> Either later today or tomorrow I could test with the same physical start
>>> and align values you're using to see whether that'd make any difference.
>>>
>>>> > As discussed with Rafael privately, I also tried this very patch
>>>> > (x86/power/64: Always create temporary identity mapping correctly) on top
>>>> > of the reverted revert of 021182e52fe01c1f7b1 (see the full log below),
>>>> > but such kernel triple faults on resume as well.
>>>> >
>>>> > 87c38d2 x86/power/64: Always create temporary identity mapping correctly
>>>> > 3cb504a Revert "Revert "x86/mm: Enable KASLR for physical mapping memory regions""
>>>> > 758850d Revert "x86/mm: Enable KASLR for physical mapping memory regions"
>>>> > 4a02dfb Revert "x86/mm: Enable KASLR for vmalloc memory regions"
>>>> > 037863f Revert "x86/mm: Add memory hotplug support for KASLR memory randomization"
>>>> > 3416a21 Revert "x86/mm: Do not reference phys addr beyond kernel"
>>>> > 69227be Revert "mm: reorganize SLAB freelist randomization"
>>>> > a1d8d71 Revert "mm: SLUB freelist randomization"
>>>> >
>>>> > IOW, 021182e52f introduces a bug for which there is no existing fix yet.
>>>>
>>>> You mean it is something different from the previous KASLR bugs we saw?
>>>
>>> No, I just wanted to explicitly point out that "x86/power/64: Always
>>> create temporary identity mapping correctly" is not a fix for this issue.
>>
>> It is better to say that the $subject patch is not sufficient to fix
>> it, because I'm quite confident that it is necessary for that. :-)
>>
>> Without the $subject patch kernel_ident_mapping_init() makes
>> assumptions that simply are not met in the randomized identity mapping
>> base case.  Moreover, hibernation works for Thomas with $subject patch
>> applied, but it doesn't without it.
>>
>> So there is something else that we are missing.
>>
>> I have a murky suspicion, but it is really weird.  Namely, what if
>> restore_jump_address in set_up_temporary_text_mapping() happens to be
>> covered by the restore kernel's identity mapping?  Then, the image
>> kernel's entry point may get overwritten by something else in
>> core_restore_code().
>>
>> But is this possible even?  Thomas?
>
> I had a similar theory before when I was investigating the original
> crash. How is it avoided even without KASLR?

It doesn't have to be actively avoided then.  restore_jump_address is
a kernel text address and if __PAGE_OFFSET is the same for both the
restore and image kernels, it is guaranteed to be above the identity
mapping in both of them.

If the base of the identity mapping is randomized in both of them,
though, that may not be guaranteed any more.

> Given the space for the physical memory mapping, I doubt this issue
> would happen all the time though.

It should not, but it's not impossible for it to happen every time, at
least in a small number of attempts.

Thanks,
Rafael
Rafael J. Wysocki Aug. 9, 2016, 9:23 p.m. UTC | #17
On Tue, Aug 9, 2016 at 10:02 PM, Jiri Kosina <jikos@kernel.org> wrote:
> On Tue, 9 Aug 2016, Rafael J. Wysocki wrote:
>
>> I have a murky suspicion, but it is really weird.  Namely, what if
>> restore_jump_address in set_up_temporary_text_mapping() happens to be
>> covered by the restore kernel's identity mapping?  Then, the image
>> kernel's entry point may get overwritten by something else in
>> core_restore_code().
>
> So this made me to actually test a scenario where I'd suspend a kernel
> that's known-broken (i.e. contains 021182e52fe), and then have it resumed
> by a kernel that has 021182e52fe reverted. It resumed successfully.
>
> Just a datapoint.

That indicates the problem is somewhere in the restore kernel and no
surprises there.

I am able to reproduce the original problem (a triple fault on resume
with CONFIG_RANDOMIZE_MEMORY set) without the $subject patch, but the
patch fixes it for me.

Question is why it is not sufficient for you and Boris and the above
theory is about the only one I can come up with ATM.

I'm going to compare the configs etc, but I guess I just end up
writing a patch to test that theory unless someone has any other idea
in the meantime.

Thanks,
Rafael
diff mbox

Patch

Index: linux-pm/arch/x86/include/asm/init.h
===================================================================
--- linux-pm.orig/arch/x86/include/asm/init.h
+++ linux-pm/arch/x86/include/asm/init.h
@@ -5,10 +5,10 @@  struct x86_mapping_info {
 	void *(*alloc_pgt_page)(void *); /* allocate buf for page table */
 	void *context;			 /* context for alloc_pgt_page */
 	unsigned long pmd_flag;		 /* page flag for PMD entry */
-	bool kernel_mapping;		 /* kernel mapping or ident mapping */
+	unsigned long offset;		 /* ident mapping offset */
 };
 
 int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
-				unsigned long addr, unsigned long end);
+				unsigned long pstart, unsigned long pend);
 
 #endif /* _ASM_X86_INIT_H */
Index: linux-pm/arch/x86/mm/ident_map.c
===================================================================
--- linux-pm.orig/arch/x86/mm/ident_map.c
+++ linux-pm/arch/x86/mm/ident_map.c
@@ -3,15 +3,17 @@ 
  * included by both the compressed kernel and the regular kernel.
  */
 
-static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page,
+static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page,
 			   unsigned long addr, unsigned long end)
 {
 	addr &= PMD_MASK;
 	for (; addr < end; addr += PMD_SIZE) {
 		pmd_t *pmd = pmd_page + pmd_index(addr);
 
-		if (!pmd_present(*pmd))
-			set_pmd(pmd, __pmd(addr | pmd_flag));
+		if (pmd_present(*pmd))
+			continue;
+
+		set_pmd(pmd, __pmd((addr - info->offset) | info->pmd_flag));
 	}
 }
 
@@ -30,13 +32,13 @@  static int ident_pud_init(struct x86_map
 
 		if (pud_present(*pud)) {
 			pmd = pmd_offset(pud, 0);
-			ident_pmd_init(info->pmd_flag, pmd, addr, next);
+			ident_pmd_init(info, pmd, addr, next);
 			continue;
 		}
 		pmd = (pmd_t *)info->alloc_pgt_page(info->context);
 		if (!pmd)
 			return -ENOMEM;
-		ident_pmd_init(info->pmd_flag, pmd, addr, next);
+		ident_pmd_init(info, pmd, addr, next);
 		set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
 	}
 
@@ -44,14 +46,15 @@  static int ident_pud_init(struct x86_map
 }
 
 int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
-			      unsigned long addr, unsigned long end)
+			      unsigned long pstart, unsigned long pend)
 {
+	unsigned long addr = pstart + info->offset;
+	unsigned long end = pend + info->offset;
 	unsigned long next;
 	int result;
-	int off = info->kernel_mapping ? pgd_index(__PAGE_OFFSET) : 0;
 
 	for (; addr < end; addr = next) {
-		pgd_t *pgd = pgd_page + pgd_index(addr) + off;
+		pgd_t *pgd = pgd_page + pgd_index(addr);
 		pud_t *pud;
 
 		next = (addr & PGDIR_MASK) + PGDIR_SIZE;
Index: linux-pm/arch/x86/power/hibernate_64.c
===================================================================
--- linux-pm.orig/arch/x86/power/hibernate_64.c
+++ linux-pm/arch/x86/power/hibernate_64.c
@@ -87,7 +87,7 @@  static int set_up_temporary_mappings(voi
 	struct x86_mapping_info info = {
 		.alloc_pgt_page	= alloc_pgt_page,
 		.pmd_flag	= __PAGE_KERNEL_LARGE_EXEC,
-		.kernel_mapping = true,
+		.offset		= __PAGE_OFFSET,
 	};
 	unsigned long mstart, mend;
 	pgd_t *pgd;