
[RFC,v5] ARM hibernation / suspend-to-disk (fwd)

Message ID alpine.DEB.2.00.1106151349520.11405@localhost6.localdomain6 (mailing list archive)

Commit Message

Frank Hofmann June 15, 2011, 1:35 p.m. UTC
On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:

> On Mon, Jun 13, 2011 at 02:20:12PM +0100, Frank Hofmann wrote:
>>
>>
>> On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:
>>
>>> On Mon, Jun 13, 2011 at 01:04:02PM +0100, Frank Hofmann wrote:
>>>>   To make it clear: IF AND ONLY IF your suspend(-to-ram) func looks like:
>>>>
>>>> 	ENTRY(acmeSoC_cpu_suspend)
>>>> 		stmfd	sp!, {r4-r12,lr}
>>>> 		ldr	r3, resume_mmu_done
>>>> 		bl	cpu_suspend
>>>> 	resume_mmu_done:
>>>> 		ldmfd	sp!, {r3-r12,pc}
>>>> 	ENDPROC(acmeSoC_cpu_suspend)
>>>
>>> Nothing has that - because you can't execute that ldmfd after the call
>>> to cpu_suspend returns.  I don't think you've understood what I said on
>>> that subject in the previous thread.
>>>
>>
>> Ok, to illustrate a bit more, what is ok and what not.
>
> Actually, we can do something about cpu_suspend.
>
> Currently cpu_suspend is not like a normal C function - when it's called
> it returns normally to a bunch of code which is not expected to return.
> The return path is via code pointed to by 'r3'.
>
> It also corrupts a bunch of registers in ways which make it non-compliant
> with a C API.
>
> If we do make this compliant, as a normal C-like function, it eliminates
> this register saving.  We also swap 'lr' and 'r3', so cpu_suspend
> effectively only returns to following code on resume - and r3 points
> to the suspend code.

Hi Russell,


this change is perfect; with this, the hibernation support code turns into 
the attached.

It's both better and simpler to perform a full suspend/resume cycle (by 
resetting in the cpu_suspend "finisher") after the snapshot image has been 
created than to shoehorn a return into it.
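
Condensed, the hibernate path in the attached then boils down to the 
following (a sketch only; the mm switch and the cache/TLB maintenance from 
the actual patch are omitted here):

 	/* the "finisher": runs once cpu_suspend has saved the core state */
 	static void __swsusp_arch_save_image(void)
 	{
 		__swsusp_arch_resume_ret = swsusp_save();  /* snapshot memory */
 		__swsusp_arch_reset();                     /* reset into cpu_resume */
 	}

 	int swsusp_arch_suspend(void)
 	{
 		/* returns here only on the resume path, via cpu_resume */
 		cpu_suspend(0, virt_to_phys(0), 0, __swsusp_arch_save_image);
 		return __swsusp_arch_resume_finish();
 	}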


I've tested the attached on v6 (samsung) and v7 (omap3 nonsecure - yes, I 
know ...) kernels that have the cpu_suspend changes you posted on Monday.

It does what it's supposed to do. And yes, I'm aware that's not enough yet; 
this is only an attempt to address the "how to snapshot the core state" 
part, and the other outstanding issues will be addressed later.


It'd be great to know why this makes your toenails curl so much ...

Anyway, this again shows that, as far as the core SoC state is concerned, 
the only difference between the hibernation snapshot and the s2ram SoC 
suspend side is what the "finisher" does:

 	hibernate:		s2ram:

 	swsusp_save()		low power self-refresh settings, ...
 	<reset>			<poweroff>

Why would calling swsusp_save() not be acceptable in this context? It's 
supposed to run in a restricted environment (no interrupts, no scheduling, 
...) anyway.
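
To make the comparison concrete: a hypothetical s2ram-side caller would 
only differ from the attached swsusp_arch_suspend() in the body of the 
finisher (the acmeSoC names below are placeholders, as before):

 	/* s2ram "finisher": enters low power; never returns to the caller,
 	   resume re-enters through cpu_resume */
 	static void acmeSoC_enter_lowpower(void)
 	{
 		/* low power / self-refresh settings, then power off */
 	}

 	static int acmeSoC_pm_enter(void)
 	{
 		cpu_suspend(0, virt_to_phys(0), 0, acmeSoC_enter_lowpower);
 		return 0;	/* reached only after resume */
 	}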


FrankH.

Comments

Matthieu CASTET June 29, 2011, 2:52 p.m. UTC | #1
Frank Hofmann wrote:
> On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:
> 
> Hi Russell,
> 
> 
> this change is perfect; with this, the hibernation support code turns into 
> the attached.
> 
> It's both better and simpler to perform a full suspend/resume cycle (by 
> resetting in the cpu_suspend "finisher") after the snapshot image has been 
> created than to shoehorn a return into it.
> 
> FrankH.


> 
> +
> +u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata;
It looks dangerous: there is no alignment constraint, but the stack should
be aligned to 8 bytes.


Matthieu
Frank Hofmann June 29, 2011, 3:14 p.m. UTC | #2
On Wed, 29 Jun 2011, Matthieu CASTET wrote:

> Frank Hofmann wrote:
>> On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:
>>
>> Hi Russell,
>>
>>
>> this change is perfect; with this, the hibernation support code turns into
>> the attached.
>>
>> It's both better and simpler to perform a full suspend/resume cycle (by
>> resetting in the cpu_suspend "finisher") after the snapshot image has been
>> created than to shoehorn a return into it.
>>
>> FrankH.
>
>
>>
>> +
>> +u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata;
> It looks dangerous: there is no alignment constraint, but the stack should
> be aligned to 8 bytes.

Uh - sorry. I used to have both the __nosavedata and 
__attribute__((__aligned__(PAGE_SIZE/2))) attributes there. That must have 
gotten lost at some point.
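
That is, the declaration was meant to read something like this, with the 
lost attribute restored:

 	u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata
 		__attribute__((__aligned__(PAGE_SIZE/2)));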


It's an artifact of the build that things turn out OK by default: 
__nosavedata forces the block into a separate section (page), and arch/arm 
is linked before kernel/power (the other user of __nosavedata), so due to 
the way the kernel build works this block happens to end up fine.

But as you say, not by intent / declaration.
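
For reference, the generic linker script brackets the nosave data roughly 
like this (quoted from memory, so treat it as a sketch):

 	/* include/asm-generic/vmlinux.lds.h, NOSAVE_DATA, roughly: */
 	. = ALIGN(PAGE_SIZE);
 	__nosave_begin = .;
 	*(.data..nosave)
 	. = ALIGN(PAGE_SIZE);
 	__nosave_end = .;

That page-aligned bracketing is also what pfn_is_nosave() in the patch 
checks against.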


Have you seen Will Deacon's suggested kexec changes? Those keep a "reset 
stack" page around, _elsewhere_, and I've been considering using that. In 
the end, all swsusp_arch_resume() really requires is a stack page that's 
guaranteed to be outside the target kernel's data and is therefore left 
alone by the restore. __nosavedata is merely one way to get that.

>
>
> Matthieu
>
Thanks,

FrankH.
Matthieu CASTET July 5, 2011, 12:37 p.m. UTC | #3
Frank Hofmann wrote:
> On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:
> 
>> On Mon, Jun 13, 2011 at 02:20:12PM +0100, Frank Hofmann wrote:
>>>
>>> On Mon, 13 Jun 2011, Russell King - ARM Linux wrote:
>>>
>>>> On Mon, Jun 13, 2011 at 01:04:02PM +0100, Frank Hofmann wrote:
>>>>>   To make it clear: IF AND ONLY IF your suspend(-to-ram) func looks like:
>>>>>
>>>>> 	ENTRY(acmeSoC_cpu_suspend)
>>>>> 		stmfd	sp!, {r4-r12,lr}
>>>>> 		ldr	r3, resume_mmu_done
>>>>> 		bl	cpu_suspend
>>>>> 	resume_mmu_done:
>>>>> 		ldmfd	sp!, {r3-r12,pc}
>>>>> 	ENDPROC(acmeSoC_cpu_suspend)
>>>> Nothing has that - because you can't execute that ldmfd after the call
>>>> to cpu_suspend returns.  I don't think you've understood what I said on
>>>> that subject in the previous thread.
>>>>
>>> Ok, to illustrate a bit more, what is ok and what not.
>> Actually, we can do something about cpu_suspend.
>>
>> Currently cpu_suspend is not like a normal C function - when it's called
>> it returns normally to a bunch of code which is not expected to return.
>> The return path is via code pointed to by 'r3'.
>>
>> It also corrupts a bunch of registers in ways which make it non-compliant
>> with a C API.
>>
>> If we do make this compliant, as a normal C-like function, it eliminates
>> this register saving.  We also swap 'lr' and 'r3', so cpu_suspend
>> effectively only returns to following code on resume - and r3 points
>> to the suspend code.
> 
> Hi Russell,
> 
> 
> this change is perfect; with this, the hibernation support code turns into 
> the attached.
> It's both better and simpler to perform a full suspend/resume cycle (by 
> resetting in the cpu_suspend "finisher") after the snapshot image has been 
> created than to shoehorn a return into it.
> 


> 
> +static void notrace __swsusp_arch_restore_image(void)
> +{
> +	extern struct pbe *restore_pblist;
> +	struct pbe *pbe;
> +
> +	cpu_switch_mm(swapper_pg_dir, &init_mm);
> +
> +	for (pbe = restore_pblist; pbe; pbe = pbe->next)
> +		copy_page(pbe->orig_address, pbe->address);
> +

One question: isn't it dangerous to modify the code we are running from?

I believe the code shouldn't change much between the kernel doing the 
resume and the resumed kernel, and the copy routine should fit in the 
instruction cache, but I want to be sure it doesn't cause any problems on 
recent ARM cores (instruction prefetching, ...).


Matthieu

Patch

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index af44a8f..e78d4f5 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -246,6 +246,7 @@  static inline void *phys_to_virt(phys_addr_t x)
  */
 #define __pa(x)			__virt_to_phys((unsigned long)(x))
 #define __va(x)			((void *)__phys_to_virt((unsigned long)(x)))
+#define __pa_symbol(x)		__pa(RELOC_HIDE((unsigned long)(x),0))
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 
 /*
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index a5b31af..c64d715 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -30,6 +30,7 @@  obj-$(CONFIG_ARTHUR)		+= arthur.o
 obj-$(CONFIG_ISA_DMA)		+= dma-isa.o
 obj-$(CONFIG_PCI)		+= bios32.o isa.o
 obj-$(CONFIG_PM_SLEEP)		+= sleep.o
+obj-$(CONFIG_HIBERNATION)	+= hibernate.o
 obj-$(CONFIG_HAVE_SCHED_CLOCK)	+= sched_clock.o
 obj-$(CONFIG_SMP)		+= smp.o smp_tlb.o
 obj-$(CONFIG_HAVE_ARM_SCU)	+= smp_scu.o
diff --git a/arch/arm/kernel/hibernate.c b/arch/arm/kernel/hibernate.c
new file mode 100644
index 0000000..bf1d2ef
--- /dev/null
+++ b/arch/arm/kernel/hibernate.c
@@ -0,0 +1,152 @@ 
+/*
+ * Hibernation support specific for ARM
+ *
+ * Derived from work on ARM hibernation support by:
+ *
+ * Ubuntu project, hibernation support for mach-dove
+ * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu)
+ * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.)
+ *	https://lkml.org/lkml/2010/6/18/4
+ *	https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html
+ *	https://patchwork.kernel.org/patch/96442/
+ *
+ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * License terms: GNU General Public License (GPL) version 2
+ */
+
+#include <linux/mm.h>
+#include <linux/suspend.h>
+#include <asm/tlbflush.h>
+#include <asm/cacheflush.h>
+
+extern const void __nosave_begin, __nosave_end;
+
+int pfn_is_nosave(unsigned long pfn)
+{
+	unsigned long nosave_begin_pfn = __pa_symbol(&__nosave_begin) >> PAGE_SHIFT;
+	unsigned long nosave_end_pfn = PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT;
+
+	return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn);
+}
+
+void notrace save_processor_state(void)
+{
+	WARN_ON(num_online_cpus() != 1);
+	flush_thread();
+	local_fiq_disable();
+}
+
+void notrace restore_processor_state(void)
+{
+	local_fiq_enable();
+}
+
+u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata;
+static int __swsusp_arch_resume_ret;
+
+static void notrace __swsusp_arch_reset(void)
+{
+	extern void cpu_resume(void);
+	void (*phys_reset)(unsigned long) =
+		(void (*)(unsigned long))virt_to_phys(cpu_reset);
+
+	flush_tlb_all();
+	flush_cache_all();
+
+	identity_mapping_add(swapper_pg_dir, 0, TASK_SIZE);
+	flush_tlb_all();
+	flush_cache_all();
+
+	cpu_proc_fin();
+	flush_tlb_all();
+	flush_cache_all();
+
+	phys_reset(virt_to_phys(cpu_resume));
+}
+
+static int __swsusp_arch_resume_finish(void)
+{
+	cpu_init();
+	identity_mapping_del(swapper_pg_dir, 0, TASK_SIZE);
+	flush_tlb_all();
+	flush_cache_all();
+
+	return __swsusp_arch_resume_ret;
+}
+
+/*
+ * Snapshot kernel memory and reset the system.
+ * After resume, the hibernation snapshot is written out.
+ */
+static void notrace __swsusp_arch_save_image(void)
+{
+	extern int swsusp_save(void);
+
+	__swsusp_arch_resume_ret = swsusp_save();
+
+	cpu_switch_mm(swapper_pg_dir, &init_mm);
+	__swsusp_arch_reset();
+}
+
+/*
+ * The framework loads the hibernation image into a linked list anchored
+ * at restore_pblist, for swsusp_arch_resume() to copy back to the proper
+ * destinations.
+ *
+ * To make this work if resume is triggered from initramfs, the
+ * pagetables need to be switched to allow writes to kernel mem.
+ */
+static void notrace __swsusp_arch_restore_image(void)
+{
+	extern struct pbe *restore_pblist;
+	struct pbe *pbe;
+
+	cpu_switch_mm(swapper_pg_dir, &init_mm);
+
+	for (pbe = restore_pblist; pbe; pbe = pbe->next)
+		copy_page(pbe->orig_address, pbe->address);
+
+	/*
+	 * Done - reset and resume from the hibernation image.
+	 */
+	__swsusp_arch_resume_ret = 0;	/* success at this point */
+	__swsusp_arch_reset();
+}
+
+/*
+ * Save the current CPU state before suspend / poweroff.
+ */
+int notrace swsusp_arch_suspend(void)
+{
+	extern void cpu_suspend(unsigned long, unsigned long, unsigned long, void *);
+
+	cpu_suspend(0, virt_to_phys(0), 0, __swsusp_arch_save_image);
+
+	/*
+	 * After resume, execution restarts here. Clean up and return.
+	 */
+	return __swsusp_arch_resume_finish();
+}
+
+/*
+ * Resume from the hibernation image.
+ * Due to the kernel heap / data restore, stack contents change underneath
+ * and that would make function calls impossible; switch to a temporary
+ * stack within the nosave region to avoid that problem.
+ */
+int __naked swsusp_arch_resume(void)
+{
+	cpu_init();	/* get a clean PSR */
+
+	/*
+	 * when switch_stack() becomes available, should use that
+	 */
+	__asm__ __volatile__("mov	sp, %0\n\t"
+		: : "r"(__swsusp_resume_stk + sizeof(__swsusp_resume_stk))
+		: "memory", "cc");
+
+	__swsusp_arch_restore_image();
+
+	return 0;
+}
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 0074b8d..c668f8f 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -627,6 +627,11 @@  config CPU_USE_DOMAINS
 config IO_36
 	bool
 
+config ARCH_HIBERNATION_POSSIBLE
+	bool
+	depends on MMU
+	default y if CPU_ARM920T || CPU_ARM926T || CPU_SA1100 || CPU_XSCALE || CPU_XSC3 || CPU_V6 || CPU_V6K || CPU_V7
+
 comment "Processor Features"
 
 config ARM_THUMB