
[3/3] RISC-V: Add arch functions to support hibernation/suspend-to-disk

Message ID 20230106060535.104321-4-jeeheng.sia@starfivetech.com (mailing list archive)
State Superseded
Series RISC-V Hibernation Support

Checks

Context | Check | Description
conchuod/patch_count | success | Link
conchuod/cover_letter | success | Series has a cover letter
conchuod/tree_selection | success | Guessed tree name to be for-next
conchuod/fixes_present | success | Fixes tag not required for -next series
conchuod/maintainers_pattern | success | MAINTAINERS pattern errors before the patch: 13 and now 13
conchuod/verify_signedoff | success | Signed-off-by tag matches author and committer
conchuod/kdoc | success | Errors and warnings before: 0 this patch: 0
conchuod/module_param | success | Was 0 now: 0
conchuod/alphanumeric_selects | success | Out of order selects before the patch: 57 and now 57
conchuod/build_rv32_defconfig | fail | Build failed
conchuod/build_warn_rv64 | fail | Failed to build the tree with this patch.
conchuod/dtb_warn_rv64 | success | Errors and warnings before: 0 this patch: 0
conchuod/header_inline | success | No static functions without inline keyword in header files
conchuod/checkpatch | warning | CHECK: Alignment should match open parenthesis; CHECK: Blank lines aren't necessary after an open brace '{'; CHECK: Consider using #include <linux/cacheflush.h> instead of <asm/cacheflush.h>; CHECK: Consider using #include <linux/kexec.h> instead of <asm/kexec.h>; CHECK: Consider using #include <linux/mmu_context.h> instead of <asm/mmu_context.h>; CHECK: Consider using #include <linux/smp.h> instead of <asm/smp.h>; CHECK: No space is necessary after a cast; WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
conchuod/source_inline | fail | Was 0 now: 1
conchuod/build_rv64_nommu_k210_defconfig | success | Build OK
conchuod/verify_fixes | success | No Fixes tag
conchuod/build_rv64_nommu_virt_defconfig | success | Build OK

Commit Message

Sia Jee Heng Jan. 6, 2023, 6:05 a.m. UTC
Add low-level arch functions to support hibernation.
swsusp_arch_suspend() relies on code from __cpu_suspend_enter() to write
the CPU state onto the stack, then calls swsusp_save() to save the memory
image.
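
Editorial illustration (not part of the patch): swsusp_arch_suspend()
depends on a function that effectively returns twice, once when the
context is captured and once when the restored image jumps back into it.
A rough user-space analogy is setjmp()/longjmp(). Note the inverted
convention: in the patch, __cpu_suspend_enter() returns non-zero on the
hibernate path and zero on resume, whereas setjmp() returns 0 first. All
names below are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-ins for illustration; none of these exist in the kernel. */
static jmp_buf saved_context;   /* plays the role of hibernate_cpu_context */
static int image_saved;         /* set on the "suspend" path */
static int in_suspend_demo = 1; /* cleared on the "resume" path */

/* Plays the role of the restore path jumping back into the saved context. */
static void fake_resume(void)
{
	longjmp(saved_context, 1);
}

/* Returns 0 after the simulated suspend-then-resume round trip. */
int demo_suspend(void)
{
	if (setjmp(saved_context) == 0) {
		/* First return: context captured, now "write the image". */
		image_saved = 1;
		fake_resume();  /* does not return */
	}

	/* Second return: execution continues here as if "resumed". */
	assert(image_saved == 1);
	in_suspend_demo = 0;
	return in_suspend_demo;
}
```

The analogy stops at control flow: the kernel path also has to restore
CSRs and flush the TLB and icache, which this sketch does not model.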

arch_hibernation_header_restore() and arch_hibernation_header_save()
are implemented to prevent a kernel crash on resume: the kernel build
version is saved in the hibernation image header to make sure that only
an image created by the same kernel is restored on resume.
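
Editorial illustration (not part of the patch): the version check boils
down to filling a zeroed structure with the running kernel's version
string and comparing it byte for byte with the one stored in the image
header. A minimal user-space sketch, assuming a 64-byte string limit to
mirror __NEW_UTS_LEN; the demo_* names are hypothetical.

```c
#include <string.h>

#define DEMO_UTS_LEN 64  /* assumed to mirror __NEW_UTS_LEN */

/* Hypothetical user-space mirror of the header invariants. */
struct demo_hdr_invariants {
	char uts_version[DEMO_UTS_LEN + 1];
};

/* Zero the struct first so the bytes after the string compare equal. */
static void demo_fill_invariants(struct demo_hdr_invariants *inv,
				 const char *version)
{
	memset(inv, 0, sizeof(*inv));
	strncpy(inv->uts_version, version, DEMO_UTS_LEN);
}

/* Returns 0 when the image matches the running kernel, -1 otherwise. */
int demo_header_check(const struct demo_hdr_invariants *saved,
		      const char *running_version)
{
	struct demo_hdr_invariants now;

	demo_fill_invariants(&now, running_version);
	return memcmp(saved, &now, sizeof(now)) ? -1 : 0;
}
```

Zeroing before the copy matters because memcmp() compares the whole
structure, not just the string up to its terminator.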

swsusp_arch_resume() creates a temporary page table that covers only
the linear map, copies the restore code to a 'safe' page, then starts to
restore the memory image. Once completed, it restores the original
kernel's page table. It then calls into __hibernate_cpu_resume()
to restore the CPU context. Finally, it follows the normal hibernation
path back to the hibernation core.
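
Editorial illustration (not part of the patch): the temporary page table
is built by walking down the levels and allocating a lower-level table
only when the slot for it is still empty. The sketch below models this
with just two levels in user space; calloc() stands in for
get_safe_page() plus memset(), and every demo_* name is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_ENTRIES    512u  /* 4 KiB table with 8-byte entries */
#define DEMO_PAGE_SHIFT 12u

/* Hypothetical two-level table: the top level points at leaf tables. */
struct demo_leaf { uint64_t pte[DEMO_ENTRIES]; };
struct demo_top  { struct demo_leaf *next[DEMO_ENTRIES]; };

/* Map vaddr to paddr, allocating the leaf table only on first use. */
int demo_map(struct demo_top *top, uint64_t vaddr, uint64_t paddr)
{
	unsigned int top_idx  = (vaddr >> (DEMO_PAGE_SHIFT + 9)) & (DEMO_ENTRIES - 1);
	unsigned int leaf_idx = (vaddr >> DEMO_PAGE_SHIFT) & (DEMO_ENTRIES - 1);

	if (!top->next[top_idx]) {
		top->next[top_idx] = calloc(1, sizeof(struct demo_leaf));
		if (!top->next[top_idx])
			return -1;
	}

	/* Frame number at bit 10 plus a valid bit in bit 0, as in RISC-V PTEs. */
	top->next[top_idx]->pte[leaf_idx] =
		((paddr >> DEMO_PAGE_SHIFT) << 10) | 1u;
	return 0;
}

/* Look the address up again; returns 0 when no mapping exists. */
uint64_t demo_lookup(const struct demo_top *top, uint64_t vaddr)
{
	unsigned int top_idx  = (vaddr >> (DEMO_PAGE_SHIFT + 9)) & (DEMO_ENTRIES - 1);
	unsigned int leaf_idx = (vaddr >> DEMO_PAGE_SHIFT) & (DEMO_ENTRIES - 1);

	return top->next[top_idx] ? top->next[top_idx]->pte[leaf_idx] : 0;
}
```

The patch applies the same allocate-on-miss step at each of the pgd,
p4d, pud and pmd levels, depending on the paging mode in use.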

To enable hibernation/suspend-to-disk on RISC-V, the config options below
need to be enabled:
- CONFIG_ARCH_HIBERNATION_HEADER
- CONFIG_ARCH_HIBERNATION_POSSIBLE

Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com>
Reviewed-by: Ley Foon Tan <leyfoon.tan@starfivetech.com>
Reviewed-by: Mason Huo <mason.huo@starfivetech.com>
---
 arch/riscv/Kconfig                |   7 +
 arch/riscv/include/asm/suspend.h  |  20 ++
 arch/riscv/kernel/Makefile        |   2 +-
 arch/riscv/kernel/asm-offsets.c   |   5 +
 arch/riscv/kernel/hibernate-asm.S | 123 +++++++++++
 arch/riscv/kernel/hibernate.c     | 353 ++++++++++++++++++++++++++++++
 6 files changed, 509 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/kernel/hibernate-asm.S
 create mode 100644 arch/riscv/kernel/hibernate.c

Comments

kernel test robot Jan. 6, 2023, 9:18 a.m. UTC | #1
Hi Sia,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on 1f5abbd77e2c1787e74b7c2caffac97def78ba52]

url:    https://github.com/intel-lab-lkp/linux/commits/Sia-Jee-Heng/RISC-V-Change-suspend_save_csrs-and-suspend_restore_csrs-to-public-function/20230106-140835
base:   1f5abbd77e2c1787e74b7c2caffac97def78ba52
patch link:    https://lore.kernel.org/r/20230106060535.104321-4-jeeheng.sia%40starfivetech.com
patch subject: [PATCH 3/3] RISC-V: Add arch functions to support hibernation/suspend-to-disk
config: riscv-allyesconfig
compiler: riscv64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/6ebc346a144af3bb328e7931c03d279f7797543c
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Sia-Jee-Heng/RISC-V-Change-suspend_save_csrs-and-suspend_restore_csrs-to-public-function/20230106-140835
        git checkout 6ebc346a144af3bb328e7931c03d279f7797543c
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=riscv olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash arch/riscv/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from arch/riscv/kernel/hibernate.c:12:
   arch/riscv/include/asm/kexec.h:62:38: error: unknown type name 'Elf_Shdr'
      62 |                                      Elf_Shdr *section,
         |                                      ^~~~~~~~
   arch/riscv/include/asm/kexec.h:63:44: error: unknown type name 'Elf_Shdr'
      63 |                                      const Elf_Shdr *relsec,
         |                                            ^~~~~~~~
   arch/riscv/include/asm/kexec.h:64:44: error: unknown type name 'Elf_Shdr'
      64 |                                      const Elf_Shdr *symtab);
         |                                            ^~~~~~~~
>> arch/riscv/kernel/hibernate.c:175:15: warning: no previous prototype for 'temp_pgtable_map_pte' [-Wmissing-prototypes]
     175 | unsigned long temp_pgtable_map_pte(pte_t *ptep, unsigned long vaddr, pgprot_t prot)
         |               ^~~~~~~~~~~~~~~~~~~~
>> arch/riscv/kernel/hibernate.c:184:15: warning: no previous prototype for 'temp_pgtable_map_pmd' [-Wmissing-prototypes]
     184 | unsigned long temp_pgtable_map_pmd(pmd_t *pmdp, unsigned long vaddr, pgprot_t prot)
         |               ^~~~~~~~~~~~~~~~~~~~
>> arch/riscv/kernel/hibernate.c:203:15: warning: no previous prototype for 'temp_pgtable_map_pud' [-Wmissing-prototypes]
     203 | unsigned long temp_pgtable_map_pud(pud_t *pudp, unsigned long vaddr, pgprot_t prot)
         |               ^~~~~~~~~~~~~~~~~~~~
>> arch/riscv/kernel/hibernate.c:223:15: warning: no previous prototype for 'temp_pgtable_map_p4d' [-Wmissing-prototypes]
     223 | unsigned long temp_pgtable_map_p4d(p4d_t *p4dp, unsigned long vaddr, pgprot_t prot)
         |               ^~~~~~~~~~~~~~~~~~~~
>> arch/riscv/kernel/hibernate.c:242:15: warning: no previous prototype for 'temp_pgtable_map_pgd' [-Wmissing-prototypes]
     242 | unsigned long temp_pgtable_map_pgd(pgd_t *pgdp, unsigned long vaddr, pgprot_t prot)
         |               ^~~~~~~~~~~~~~~~~~~~
>> arch/riscv/kernel/hibernate.c:261:15: warning: no previous prototype for 'temp_pgtable_mapping' [-Wmissing-prototypes]
     261 | unsigned long temp_pgtable_mapping(pgd_t *pgdp, unsigned long vaddr, pgprot_t prot)
         |               ^~~~~~~~~~~~~~~~~~~~
>> arch/riscv/kernel/hibernate.c:266:15: warning: no previous prototype for 'relocate_restore_code' [-Wmissing-prototypes]
     266 | unsigned long relocate_restore_code(void)
         |               ^~~~~~~~~~~~~~~~~~~~~


vim +/temp_pgtable_map_pte +175 arch/riscv/kernel/hibernate.c

   167	
   168	#define temp_pgtable_map_pgd_next(pgdp, vaddr, prot)			\
   169			(pgtable_l5_enabled ?					\
   170			temp_pgtable_map_p4d(pgdp, vaddr, prot) :		\
   171			(pgtable_l4_enabled ?					\
   172			temp_pgtable_map_pud((pud_t *)pgdp, vaddr, prot) :	\
   173			temp_pgtable_map_pmd((pmd_t *)pgdp, vaddr, prot)))
   174	
 > 175	unsigned long temp_pgtable_map_pte(pte_t *ptep, unsigned long vaddr, pgprot_t prot)
   176	{
   177		uintptr_t pte_idx = pte_index(vaddr);
   178	
   179		ptep[pte_idx] = pfn_pte(PFN_DOWN(__pa(vaddr)), prot);
   180	
   181		return 0;
   182	}
   183	
 > 184	unsigned long temp_pgtable_map_pmd(pmd_t *pmdp, unsigned long vaddr, pgprot_t prot)
   185	{
   186		uintptr_t pmd_idx = pmd_index(vaddr);
   187		pte_t *ptep;
   188	
   189		if (pmd_none(pmdp[pmd_idx])) {
   190			ptep = (pte_t *) get_safe_page(GFP_ATOMIC);
   191			if (!ptep)
   192				return -ENOMEM;
   193	
   194			memset(ptep, 0, PAGE_SIZE);
   195			pmdp[pmd_idx] = pfn_pmd(PFN_DOWN(__pa(ptep)), PAGE_TABLE);
   196		} else {
   197			ptep = (pte_t *) __va(PFN_PHYS(_pmd_pfn(pmdp[pmd_idx])));
   198		}
   199	
   200		return temp_pgtable_map_pte(ptep, vaddr, prot);
   201	}
   202	
 > 203	unsigned long temp_pgtable_map_pud(pud_t *pudp, unsigned long vaddr, pgprot_t prot)
   204	{
   205	
   206		uintptr_t pud_index = pud_index(vaddr);
   207		pmd_t *pmdp;
   208	
   209		if (pud_val(pudp[pud_index]) == 0) {
   210			pmdp = (pmd_t *) get_safe_page(GFP_ATOMIC);
   211			if (!pmdp)
   212				return -ENOMEM;
   213	
   214			memset(pmdp, 0, PAGE_SIZE);
   215			pudp[pud_index] = pfn_pud(PFN_DOWN(__pa(pmdp)), PAGE_TABLE);
   216		} else {
   217			pmdp = (pmd_t *) __va(PFN_PHYS(_pud_pfn(pudp[pud_index])));
   218		}
   219	
   220		return temp_pgtable_map_pmd(pmdp, vaddr, prot);
   221	}
   222	
 > 223	unsigned long temp_pgtable_map_p4d(p4d_t *p4dp, unsigned long vaddr, pgprot_t prot)
   224	{
   225		uintptr_t p4d_index = p4d_index(vaddr);
   226		pud_t *pudp;
   227	
   228		if (p4d_val(p4dp[p4d_index]) == 0) {
   229			pudp = (pud_t *) get_safe_page(GFP_ATOMIC);
   230			if (!pudp)
   231				return -ENOMEM;
   232	
   233			memset(pudp, 0, PAGE_SIZE);
   234			p4dp[p4d_index] = pfn_p4d(PFN_DOWN(__pa(pudp)), PAGE_TABLE);
   235		} else {
   236			pudp = (pud_t *) __va(PFN_PHYS(_p4d_pfn(p4dp[p4d_index])));
   237		}
   238	
   239		return temp_pgtable_map_pud(pudp, vaddr, prot);
   240	}
   241	
 > 242	unsigned long temp_pgtable_map_pgd(pgd_t *pgdp, unsigned long vaddr, pgprot_t prot)
   243	{
   244		uintptr_t pgd_idx = pgd_index(vaddr);
   245		void *nextp;
   246	
   247		if (pgd_val(pgdp[pgd_idx]) == 0) {
   248			nextp = (void *) get_safe_page(GFP_ATOMIC);
   249			if (!nextp)
   250				return -ENOMEM;
   251	
   252			memset(nextp, 0, PAGE_SIZE);
   253			pgdp[pgd_idx] = pfn_pgd(PFN_DOWN(__pa(nextp)), PAGE_TABLE);
   254		} else {
   255			nextp = (void *) __va(PFN_PHYS(_pgd_pfn(pgdp[pgd_idx])));
   256		}
   257	
   258		return temp_pgtable_map_pgd_next(nextp, vaddr, prot);
   259	}
   260	
 > 261	unsigned long temp_pgtable_mapping(pgd_t *pgdp, unsigned long vaddr, pgprot_t prot)
   262	{
   263		return temp_pgtable_map_pgd(pgdp, vaddr, prot);
   264	}
   265	
 > 266	unsigned long relocate_restore_code(void)
   267	{
   268		unsigned long ret;
   269		void *page = (void *)get_safe_page(GFP_ATOMIC);
   270	
   271		if (!page)
   272			return -ENOMEM;
   273	
   274		copy_page(page, core_restore_code);
   275	
   276		/* Make the page containing the relocated code executable */
   277		set_memory_x((unsigned long)page, 1);
   278	
   279		ret = temp_pgtable_mapping(resume_pg_dir, (unsigned long)page, PAGE_KERNEL_READ_EXEC);
   280		if (ret)
   281			return ret;
   282	
   283		return (unsigned long)page;
   284	}
   285
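
Editorial illustration (not part of the robot's report): the
-Wmissing-prototypes warnings above fire because the helpers have
external linkage but no earlier declaration. One common way to address
this, assuming the helpers are only used inside hibernate.c, is to make
them static; anything that must stay global gets a declaration first.
The function names below are hypothetical.

```c
/*
 * Non-static definition with no earlier declaration: building this with
 * -Wmissing-prototypes reports "no previous prototype" for it.
 */
unsigned long demo_map_external(unsigned long vaddr)
{
	return vaddr >> 12;
}

/* static gives the helper internal linkage, so no prototype is expected. */
static unsigned long demo_map_internal(unsigned long vaddr)
{
	return vaddr >> 12;
}

/* A function that stays global is declared before it is defined. */
unsigned long demo_map(unsigned long vaddr);

unsigned long demo_map(unsigned long vaddr)
{
	return demo_map_internal(vaddr);
}
```

In a kernel tree the declaration would normally live in a header that
the .c file includes, rather than directly above the definition.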
Pavel Machek Feb. 18, 2023, 2:01 p.m. UTC | #2
Hi!

> @@ -21,6 +21,11 @@ struct suspend_context {
>  #endif
>  };
>  
> +/* This parameter will be assigned to 0 during resume and will be used by
> + * hibernation core for the subsequent resume sequence
> + */
> +extern int in_suspend;

"/*" should be on separate line. (This occurs multiple times.)

> +ENTRY(core_restore_code)
> +	/* switch to temp page table */
> +	csrw satp, s1
> +	sfence.vma
> +	beqz	s4, done
> +loop:
> +	/* The below code will restore the hibernated image. */
> +	ld	a1, HIBERN_PBE_ADDR(s4)
> +	ld	a0, HIBERN_PBE_ORIG(s4)
> +
> +	lui     a4, 0x1
> +	add     a4, a4, a0

Do you have enough registers to move a4 = 0x1 assignment before the
loop:?
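
Editorial illustration (not part of the original mail): `lui a4, 0x1`
loads the constant 0x1000 (4096, the page size the copy loop assumes)
and is executed again for every page backup entry. The question is
whether that constant could be loaded once into a spare register before
`loop:`, leaving only the `add` that forms the end address inside the
loop. The same idea in C, as a sketch with a hypothetical demo_pbe
structure modelled on the three pbe fields the patch uses:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u

/* Hypothetical mirror of the pbe fields used by the assembly. */
struct demo_pbe {
	uint64_t *address;       /* where the saved copy lives */
	uint64_t *orig_address;  /* where the page must be restored to */
	struct demo_pbe *next;
};

/* Walk the list and copy each page back, one 64-bit word at a time. */
size_t demo_restore(struct demo_pbe *list)
{
	/* Loop-invariant: computed once, like hoisting "lui" above "loop:". */
	const size_t words_per_page = DEMO_PAGE_SIZE / sizeof(uint64_t);
	size_t pages = 0;

	for (struct demo_pbe *p = list; p; p = p->next) {
		uint64_t *src = p->address;
		uint64_t *dst = p->orig_address;
		/* Per-entry end pointer, the counterpart of "add a4, a4, a0". */
		uint64_t *end = dst + words_per_page;

		while (dst != end)
			*dst++ = *src++;
		pages++;
	}
	return pages;
}
```

In the assembly, hoisting would need a second register to hold the
constant, since a4 is overwritten with the end address on each pass.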

> +static int __init riscv_hibernate__init(void)
> +{
> +	hibernate_cpu_context = kcalloc(1, sizeof(struct suspend_context), GFP_KERNEL);
> +

kzalloc?

> +	if (WARN_ON(!hibernate_cpu_context))
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +early_initcall(riscv_hibernate__init);

Does this really need to be done early?

Best regards,
								Pavel
Sia Jee Heng Feb. 20, 2023, 4:21 a.m. UTC | #3
Hi,

This is the old series. Could you review the latest series at: https://patchwork.kernel.org/project/linux-riscv/list/?series=716143 ?

> -----Original Message-----
> From: Pavel Machek <pavel@ucw.cz>
> Sent: Saturday, 18 February, 2023 10:01 PM
> To: JeeHeng Sia <jeeheng.sia@starfivetech.com>
> Cc: paul.walmsley@sifive.com; palmer@dabbelt.com; aou@eecs.berkeley.edu; linux-riscv@lists.infradead.org; linux-
> kernel@vger.kernel.org; Leyfoon Tan <leyfoon.tan@starfivetech.com>; Mason Huo <mason.huo@starfivetech.com>
> Subject: Re: [PATCH 3/3] RISC-V: Add arch functions to support hibernation/suspend-to-disk
> 
> Hi!
> 
> > @@ -21,6 +21,11 @@ struct suspend_context {
> >  #endif
> >  };
> >
> > +/* This parameter will be assigned to 0 during resume and will be used by
> > + * hibernation core for the subsequent resume sequence
> > + */
> > +extern int in_suspend;
> 
> "/*" should be on separate line. (This occurs multiple times.)
already resolved in v3 series.
> 
> > +ENTRY(core_restore_code)
> > +	/* switch to temp page table */
> > +	csrw satp, s1
> > +	sfence.vma
> > +	beqz	s4, done
> > +loop:
> > +	/* The below code will restore the hibernated image. */
> > +	ld	a1, HIBERN_PBE_ADDR(s4)
> > +	ld	a0, HIBERN_PBE_ORIG(s4)
> > +
> > +	lui     a4, 0x1
> > +	add     a4, a4, a0
> 
> Do you have enough registers to move a4 = 0x1 assignment before the
> loop:?
I am sorry but I don't understand the question. Could you elaborate please?
> 
> > +static int __init riscv_hibernate__init(void)
> > +{
> > +	hibernate_cpu_context = kcalloc(1, sizeof(struct suspend_context), GFP_KERNEL);
> > +
> 
> kzalloc?
sure, will change it.
> 
> > +	if (WARN_ON(!hibernate_cpu_context))
> > +		return -ENOMEM;
> > +
> > +	return 0;
> > +}
> > +
> > +early_initcall(riscv_hibernate__init);
> 
> Does this really need to be done early?
yes. To make sure it is initialized before next resume.
> 
> Best regards,
> 								Pavel
> --
> People of Russia, stop Putin before his war on Ukraine escalates.
Sia Jee Heng Feb. 21, 2023, 3:16 a.m. UTC | #4
Hi,

In case you are not aware, the latest series (v4) can be found at: https://patchwork.kernel.org/project/linux-riscv/list/?series=723586

Thanks
Regards
Jee Heng


Patch

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index e2b656043abf..50474d3aa62e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -690,6 +690,13 @@  menu "Power management options"
 
 source "kernel/power/Kconfig"
 
+config ARCH_HIBERNATION_POSSIBLE
+       def_bool y
+
+config ARCH_HIBERNATION_HEADER
+       def_bool y
+       depends on HIBERNATION
+
 endmenu # "Power management options"
 
 menu "CPU Power Management"
diff --git a/arch/riscv/include/asm/suspend.h b/arch/riscv/include/asm/suspend.h
index 75419c5ca272..ebaf103aec40 100644
--- a/arch/riscv/include/asm/suspend.h
+++ b/arch/riscv/include/asm/suspend.h
@@ -21,6 +21,11 @@  struct suspend_context {
 #endif
 };
 
+/* This parameter will be assigned to 0 during resume and will be used by
+ * hibernation core for the subsequent resume sequence
+ */
+extern int in_suspend;
+
 /* Low-level CPU suspend entry function */
 int __cpu_suspend_enter(struct suspend_context *context);
 
@@ -36,4 +41,19 @@  int __cpu_resume_enter(unsigned long hartid, unsigned long context);
 /* Used to save and restore the csr */
 void suspend_save_csrs(struct suspend_context *context);
 void suspend_restore_csrs(struct suspend_context *context);
+
+/* Low-level API to support hibernation */
+int swsusp_arch_suspend(void);
+int swsusp_arch_resume(void);
+int arch_hibernation_header_save(void *addr, unsigned int max_size);
+int arch_hibernation_header_restore(void *addr);
+int __hibernate_cpu_resume(void);
+
+/* Used to resume on the CPU we hibernated on */
+int hibernate_resume_nonboot_cpu_disable(void);
+
+/* Used to restore the hibernated image */
+asmlinkage void restore_image(unsigned long resume_satp, unsigned long satp_temp,
+				unsigned long cpu_resume);
+asmlinkage int core_restore_code(void);
 #endif
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 4cf303a779ab..df83b8cea631 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -64,7 +64,7 @@  obj-$(CONFIG_MODULES)		+= module.o
 obj-$(CONFIG_MODULE_SECTIONS)	+= module-sections.o
 
 obj-$(CONFIG_CPU_PM)		+= suspend_entry.o suspend.o
-
+obj-$(CONFIG_HIBERNATION)	+= hibernate.o hibernate-asm.o
 obj-$(CONFIG_FUNCTION_TRACER)	+= mcount.o ftrace.o
 obj-$(CONFIG_DYNAMIC_FTRACE)	+= mcount-dyn.o
 
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index df9444397908..d6a75aac1d27 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -9,6 +9,7 @@ 
 #include <linux/kbuild.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
+#include <linux/suspend.h>
 #include <asm/kvm_host.h>
 #include <asm/thread_info.h>
 #include <asm/ptrace.h>
@@ -116,6 +117,10 @@  void asm_offsets(void)
 
 	OFFSET(SUSPEND_CONTEXT_REGS, suspend_context, regs);
 
+	OFFSET(HIBERN_PBE_ADDR, pbe, address);
+	OFFSET(HIBERN_PBE_ORIG, pbe, orig_address);
+	OFFSET(HIBERN_PBE_NEXT, pbe, next);
+
 	OFFSET(KVM_ARCH_GUEST_ZERO, kvm_vcpu_arch, guest_context.zero);
 	OFFSET(KVM_ARCH_GUEST_RA, kvm_vcpu_arch, guest_context.ra);
 	OFFSET(KVM_ARCH_GUEST_SP, kvm_vcpu_arch, guest_context.sp);
diff --git a/arch/riscv/kernel/hibernate-asm.S b/arch/riscv/kernel/hibernate-asm.S
new file mode 100644
index 000000000000..81d9dc98d0ad
--- /dev/null
+++ b/arch/riscv/kernel/hibernate-asm.S
@@ -0,0 +1,123 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Hibernation support specific for RISCV
+ *
+ * Copyright (C) 2023 StarFive Technology Co., Ltd.
+ *
+ * Author: Jee Heng Sia <jeeheng.sia@starfivetech.com>
+ */
+
+#include <asm/asm.h>
+#include <asm/asm-offsets.h>
+#include <asm/csr.h>
+#include <linux/linkage.h>
+
+/*
+ * This code is executed when resuming from hibernation.
+ *
+ * It begins by loading the temporary page table, then restores the memory image.
+ * Finally, it branches to __hibernate_cpu_resume() to restore the state saved by
+ * swsusp_arch_suspend().
+ */
+
+/*
+ * int __hibernate_cpu_resume(void)
+ * Switch back to the hibernated image's page table before restoring the CPU
+ * context.
+ *
+ * Always returns 0 to the C code.
+ */
+ENTRY(__hibernate_cpu_resume)
+        /* switch to hibernated image's page table */
+        csrw CSR_SATP, s0
+        sfence.vma
+
+	ld	a0, hibernate_cpu_context
+
+	/* Restore CSRs */
+	REG_L   t0, (SUSPEND_CONTEXT_REGS + PT_EPC)(a0)
+	csrw    CSR_EPC, t0
+	REG_L   t0, (SUSPEND_CONTEXT_REGS + PT_STATUS)(a0)
+	csrw    CSR_STATUS, t0
+	REG_L   t0, (SUSPEND_CONTEXT_REGS + PT_BADADDR)(a0)
+	csrw    CSR_TVAL, t0
+	REG_L   t0, (SUSPEND_CONTEXT_REGS + PT_CAUSE)(a0)
+	csrw    CSR_CAUSE, t0
+
+	/* Restore registers (except A0 and T0-T6) */
+	REG_L   ra, (SUSPEND_CONTEXT_REGS + PT_RA)(a0)
+	REG_L   sp, (SUSPEND_CONTEXT_REGS + PT_SP)(a0)
+	REG_L   gp, (SUSPEND_CONTEXT_REGS + PT_GP)(a0)
+	REG_L   tp, (SUSPEND_CONTEXT_REGS + PT_TP)(a0)
+
+	REG_L   s0, (SUSPEND_CONTEXT_REGS + PT_S0)(a0)
+	REG_L   s1, (SUSPEND_CONTEXT_REGS + PT_S1)(a0)
+	REG_L   a1, (SUSPEND_CONTEXT_REGS + PT_A1)(a0)
+	REG_L   a2, (SUSPEND_CONTEXT_REGS + PT_A2)(a0)
+	REG_L   a3, (SUSPEND_CONTEXT_REGS + PT_A3)(a0)
+	REG_L   a4, (SUSPEND_CONTEXT_REGS + PT_A4)(a0)
+	REG_L   a5, (SUSPEND_CONTEXT_REGS + PT_A5)(a0)
+	REG_L   a6, (SUSPEND_CONTEXT_REGS + PT_A6)(a0)
+	REG_L   a7, (SUSPEND_CONTEXT_REGS + PT_A7)(a0)
+	REG_L   s2, (SUSPEND_CONTEXT_REGS + PT_S2)(a0)
+	REG_L   s3, (SUSPEND_CONTEXT_REGS + PT_S3)(a0)
+	REG_L   s4, (SUSPEND_CONTEXT_REGS + PT_S4)(a0)
+	REG_L   s5, (SUSPEND_CONTEXT_REGS + PT_S5)(a0)
+	REG_L   s6, (SUSPEND_CONTEXT_REGS + PT_S6)(a0)
+	REG_L   s7, (SUSPEND_CONTEXT_REGS + PT_S7)(a0)
+	REG_L   s8, (SUSPEND_CONTEXT_REGS + PT_S8)(a0)
+	REG_L   s9, (SUSPEND_CONTEXT_REGS + PT_S9)(a0)
+	REG_L   s10, (SUSPEND_CONTEXT_REGS + PT_S10)(a0)
+	REG_L   s11, (SUSPEND_CONTEXT_REGS + PT_S11)(a0)
+
+	/* Return zero value */
+	add     a0, zero, zero
+
+	ret
+END(__hibernate_cpu_resume)
+
+/*
+ * Prepare to restore the image.
+ * a0: satp of saved page tables
+ * a1: satp of temporary page tables
+ * a2: cpu_resume
+ */
+ENTRY(restore_image)
+	mv	s0, a0
+	mv	s1, a1
+	mv	s2, a2
+	ld      s4, restore_pblist
+	ld	a1, relocated_restore_code
+
+	jalr	a1
+END(restore_image)
+
+/*
+ * The code below is executed from a 'safe' page.
+ * It first switches to the temporary page table, then starts copying the pages
+ * back to their original memory locations. Finally, it jumps to
+ * __hibernate_cpu_resume() to restore the CPU context.
+ */
+ENTRY(core_restore_code)
+	/* switch to temp page table */
+	csrw satp, s1
+	sfence.vma
+	beqz	s4, done
+loop:
+	/* The below code will restore the hibernated image. */
+	ld	a1, HIBERN_PBE_ADDR(s4)
+	ld	a0, HIBERN_PBE_ORIG(s4)
+
+	lui     a4, 0x1
+	add     a4, a4, a0
+copy:	ld      a5, 0(a1)
+	addi    a0, a0, 8
+	addi    a1, a1, 8
+	sd      a5, -8(a0)
+	bne     a4, a0, copy
+
+	ld	s4, HIBERN_PBE_NEXT(s4)
+	bnez	s4, loop
+done:
+	jalr	s2
+END(core_restore_code)
diff --git a/arch/riscv/kernel/hibernate.c b/arch/riscv/kernel/hibernate.c
new file mode 100644
index 000000000000..78ffad3bdbd8
--- /dev/null
+++ b/arch/riscv/kernel/hibernate.c
@@ -0,0 +1,353 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Hibernation support specific for RISCV
+ *
+ * Copyright (C) 2023 StarFive Technology Co., Ltd.
+ *
+ * Author: Jee Heng Sia <jeeheng.sia@starfivetech.com>
+ */
+
+#include <asm/barrier.h>
+#include <asm/cacheflush.h>
+#include <asm/kexec.h>
+#include <asm/mmu_context.h>
+#include <asm/page.h>
+#include <asm/sections.h>
+#include <asm/set_memory.h>
+#include <asm/smp.h>
+#include <asm/suspend.h>
+
+#include <linux/cpu.h>
+#include <linux/memblock.h>
+#include <linux/pm.h>
+#include <linux/sched.h>
+#include <linux/suspend.h>
+#include <linux/utsname.h>
+
+/*
+ * The logical cpu number we should resume on, initialised to a non-cpu
+ * number.
+ */
+static int sleep_cpu = -EINVAL;
+
+/* CPU context to be saved */
+struct suspend_context *hibernate_cpu_context;
+
+unsigned long relocated_restore_code;
+
+/* Pointer to the temporary resume page tables */
+pgd_t *resume_pg_dir;
+
+/*
+ * Save the build number and date so that we do not resume with a different kernel
+ */
+struct arch_hibernate_hdr_invariants {
+	char		uts_version[__NEW_UTS_LEN + 1];
+};
+
+/* Helper parameters that help us to restore the image.
+ * @hartid: To make sure the same boot CPU executes the hibernate/restore code.
+ * @saved_satp: Original page table used by the hibernated image.
+ * @restore_cpu_addr: The kernel's image address to restore the CPU context.
+ */
+static struct arch_hibernate_hdr {
+	struct arch_hibernate_hdr_invariants invariants;
+	unsigned long	hartid;
+	unsigned long	saved_satp;
+	unsigned long	restore_cpu_addr;
+} resume_hdr;
+
+static inline void arch_hdr_invariants(struct arch_hibernate_hdr_invariants *i)
+{
+	memset(i, 0, sizeof(*i));
+	memcpy(i->uts_version, init_utsname()->version, sizeof(i->uts_version));
+}
+
+/*
+ * Check if the given pfn is in the 'nosave' section.
+ */
+int pfn_is_nosave(unsigned long pfn)
+{
+	unsigned long nosave_begin_pfn = sym_to_pfn(&__nosave_begin);
+	unsigned long nosave_end_pfn = sym_to_pfn(&__nosave_end - 1);
+
+	return ((pfn >= nosave_begin_pfn) && (pfn <= nosave_end_pfn));
+}
+
+void notrace save_processor_state(void)
+{
+	WARN_ON(num_online_cpus() != 1);
+}
+
+void notrace restore_processor_state(void)
+{
+}
+
+/*
+ * Helper parameters need to be saved to the hibernation image header.
+ */
+int arch_hibernation_header_save(void *addr, unsigned int max_size)
+{
+	struct arch_hibernate_hdr *hdr = addr;
+
+	if (max_size < sizeof(*hdr))
+		return -EOVERFLOW;
+
+	arch_hdr_invariants(&hdr->invariants);
+
+	hdr->hartid = cpuid_to_hartid_map(sleep_cpu);
+	hdr->saved_satp = csr_read(CSR_SATP);
+	hdr->restore_cpu_addr = (unsigned long) __hibernate_cpu_resume;
+
+	return 0;
+}
+EXPORT_SYMBOL(arch_hibernation_header_save);
+
+/*
+ * Retrieve the helper parameters from the hibernation image header
+ */
+int arch_hibernation_header_restore(void *addr)
+{
+	struct arch_hibernate_hdr_invariants invariants;
+	struct arch_hibernate_hdr *hdr = addr;
+	int ret = 0;
+
+	arch_hdr_invariants(&invariants);
+
+	if (memcmp(&hdr->invariants, &invariants, sizeof(invariants))) {
+		pr_crit("Hibernate image not generated by this kernel!\n");
+		return -EINVAL;
+	}
+
+	sleep_cpu = riscv_hartid_to_cpuid(hdr->hartid);
+	if (sleep_cpu < 0) {
+		pr_crit("Hibernated on a CPU not known to this kernel!\n");
+		sleep_cpu = -EINVAL;
+		return -EINVAL;
+	}
+
+#ifdef CONFIG_SMP
+	ret = bringup_hibernate_cpu(sleep_cpu);
+	if (ret) {
+		sleep_cpu = -EINVAL;
+		return ret;
+	}
+#endif
+	resume_hdr = *hdr;
+
+	return ret;
+}
+EXPORT_SYMBOL(arch_hibernation_header_restore);
+
+int swsusp_arch_suspend(void)
+{
+	int ret = 0;
+
+	if (__cpu_suspend_enter(hibernate_cpu_context)) {
+		sleep_cpu = smp_processor_id();
+		suspend_save_csrs(hibernate_cpu_context);
+		ret = swsusp_save();
+	} else {
+		suspend_restore_csrs(hibernate_cpu_context);
+		flush_tlb_all();
+
+		/* Invalidate the icache */
+		flush_icache_all();
+
+		/*
+		 * Tell the hibernation core that we've just restored
+		 * the memory
+		 */
+		in_suspend = 0;
+		sleep_cpu = -EINVAL;
+	}
+
+	return ret;
+}
+
+#define temp_pgtable_map_pgd_next(pgdp, vaddr, prot)			\
+		(pgtable_l5_enabled ?					\
+		temp_pgtable_map_p4d(pgdp, vaddr, prot) :		\
+		(pgtable_l4_enabled ?					\
+		temp_pgtable_map_pud((pud_t *)pgdp, vaddr, prot) :	\
+		temp_pgtable_map_pmd((pmd_t *)pgdp, vaddr, prot)))
+
+unsigned long temp_pgtable_map_pte(pte_t *ptep, unsigned long vaddr, pgprot_t prot)
+{
+	uintptr_t pte_idx = pte_index(vaddr);
+
+	ptep[pte_idx] = pfn_pte(PFN_DOWN(__pa(vaddr)), prot);
+
+	return 0;
+}
+
+unsigned long temp_pgtable_map_pmd(pmd_t *pmdp, unsigned long vaddr, pgprot_t prot)
+{
+	uintptr_t pmd_idx = pmd_index(vaddr);
+	pte_t *ptep;
+
+	if (pmd_none(pmdp[pmd_idx])) {
+		ptep = (pte_t *) get_safe_page(GFP_ATOMIC);
+		if (!ptep)
+			return -ENOMEM;
+
+		memset(ptep, 0, PAGE_SIZE);
+		pmdp[pmd_idx] = pfn_pmd(PFN_DOWN(__pa(ptep)), PAGE_TABLE);
+	} else {
+		ptep = (pte_t *) __va(PFN_PHYS(_pmd_pfn(pmdp[pmd_idx])));
+	}
+
+	return temp_pgtable_map_pte(ptep, vaddr, prot);
+}
+
+unsigned long temp_pgtable_map_pud(pud_t *pudp, unsigned long vaddr, pgprot_t prot)
+{
+
+	uintptr_t pud_index = pud_index(vaddr);
+	pmd_t *pmdp;
+
+	if (pud_val(pudp[pud_index]) == 0) {
+		pmdp = (pmd_t *) get_safe_page(GFP_ATOMIC);
+		if (!pmdp)
+			return -ENOMEM;
+
+		memset(pmdp, 0, PAGE_SIZE);
+		pudp[pud_index] = pfn_pud(PFN_DOWN(__pa(pmdp)), PAGE_TABLE);
+	} else {
+		pmdp = (pmd_t *) __va(PFN_PHYS(_pud_pfn(pudp[pud_index])));
+	}
+
+	return temp_pgtable_map_pmd(pmdp, vaddr, prot);
+}
+
+unsigned long temp_pgtable_map_p4d(p4d_t *p4dp, unsigned long vaddr, pgprot_t prot)
+{
+	uintptr_t p4d_index = p4d_index(vaddr);
+	pud_t *pudp;
+
+	if (p4d_val(p4dp[p4d_index]) == 0) {
+		pudp = (pud_t *) get_safe_page(GFP_ATOMIC);
+		if (!pudp)
+			return -ENOMEM;
+
+		memset(pudp, 0, PAGE_SIZE);
+		p4dp[p4d_index] = pfn_p4d(PFN_DOWN(__pa(pudp)), PAGE_TABLE);
+	} else {
+		pudp = (pud_t *) __va(PFN_PHYS(_p4d_pfn(p4dp[p4d_index])));
+	}
+
+	return temp_pgtable_map_pud(pudp, vaddr, prot);
+}
+
+unsigned long temp_pgtable_map_pgd(pgd_t *pgdp, unsigned long vaddr, pgprot_t prot)
+{
+	uintptr_t pgd_idx = pgd_index(vaddr);
+	void *nextp;
+
+	if (pgd_val(pgdp[pgd_idx]) == 0) {
+		nextp = (void *) get_safe_page(GFP_ATOMIC);
+		if (!nextp)
+			return -ENOMEM;
+
+		memset(nextp, 0, PAGE_SIZE);
+		pgdp[pgd_idx] = pfn_pgd(PFN_DOWN(__pa(nextp)), PAGE_TABLE);
+	} else {
+		nextp = (void *) __va(PFN_PHYS(_pgd_pfn(pgdp[pgd_idx])));
+	}
+
+	return temp_pgtable_map_pgd_next(nextp, vaddr, prot);
+}
+
+unsigned long temp_pgtable_mapping(pgd_t *pgdp, unsigned long vaddr, pgprot_t prot)
+{
+	return temp_pgtable_map_pgd(pgdp, vaddr, prot);
+}
+
+unsigned long relocate_restore_code(void)
+{
+	unsigned long ret;
+	void *page = (void *)get_safe_page(GFP_ATOMIC);
+
+	if (!page)
+		return -ENOMEM;
+
+	copy_page(page, core_restore_code);
+
+	/* Make the page containing the relocated code executable */
+	set_memory_x((unsigned long)page, 1);
+
+	ret = temp_pgtable_mapping(resume_pg_dir, (unsigned long)page, PAGE_KERNEL_READ_EXEC);
+	if (ret)
+		return ret;
+
+	return (unsigned long)page;
+}
+
+int swsusp_arch_resume(void)
+{
+	unsigned long addr = PAGE_OFFSET;
+	unsigned long ret;
+
+	/*
+	 * Memory allocated by get_safe_page() will be dealt with by the hibernation core,
+	 * we don't need to free it here.
+	 */
+	resume_pg_dir = (pgd_t *)get_safe_page(GFP_ATOMIC);
+	if (!resume_pg_dir)
+		return -ENOMEM;
+
+	/*
+	 * The pages need to be writable when restoring the image.
+	 * Create a second copy of page table just for the linear map, and use this when
+	 * restoring.
+	 */
+	for (; addr <= (unsigned long)pfn_to_virt(max_low_pfn); addr += PAGE_SIZE) {
+		ret = temp_pgtable_mapping(resume_pg_dir, addr, PAGE_KERNEL);
+		if (ret)
+			return (int) ret;
+	}
+
+	/* Move the restore code to a new page so that it doesn't get overwritten by itself */
+	relocated_restore_code = relocate_restore_code();
+	if (relocated_restore_code == -ENOMEM)
+		return -ENOMEM;
+
+	/* Map the __hibernate_cpu_resume() address to the temporary page table so that the
+	 * restore code can jump to it after it has finished restoring the image. This way
+	 * the code that executes next does not find itself in a different address space after
+	 * switching over to the original page table used by the hibernated image.
+	 */
+	ret = temp_pgtable_mapping(resume_pg_dir, (unsigned long)resume_hdr.restore_cpu_addr,
+				PAGE_KERNEL_READ_EXEC);
+	if (ret)
+		return ret;
+
+	restore_image(resume_hdr.saved_satp, (PFN_DOWN(__pa(resume_pg_dir)) | satp_mode),
+			resume_hdr.restore_cpu_addr);
+
+	return 0;
+}
+
+#ifdef CONFIG_SMP
+int hibernate_resume_nonboot_cpu_disable(void)
+{
+	if (sleep_cpu < 0) {
+		pr_err("Failing to resume from hibernate on an unknown CPU.\n");
+		return -ENODEV;
+	}
+
+	return freeze_secondary_cpus(sleep_cpu);
+}
+#endif
+
+static int __init riscv_hibernate__init(void)
+{
+	hibernate_cpu_context = kcalloc(1, sizeof(struct suspend_context), GFP_KERNEL);
+
+	if (WARN_ON(!hibernate_cpu_context))
+		return -ENOMEM;
+
+	return 0;
+}
+
+early_initcall(riscv_hibernate__init);
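
Editorial illustration (not part of the patch): the second argument
passed to restore_image() in swsusp_arch_resume() is a satp value built
from the root table's page frame number ORed with the paging mode. A
stand-alone sketch of that composition for RV64, assuming the field
layout from the RISC-V privileged specification (MODE in bits 63:60,
ASID in bits 59:44, PPN in bits 43:0); the demo_* names are hypothetical.

```c
#include <stdint.h>

/* RV64 satp MODE encodings: 8 selects Sv39, 9 selects Sv48. */
#define DEMO_SATP_MODE_SV39 (UINT64_C(8) << 60)
#define DEMO_SATP_MODE_SV48 (UINT64_C(9) << 60)
#define DEMO_PAGE_SHIFT     12

/* Build a satp value from a root table's physical address (ASID left at 0). */
uint64_t demo_make_satp(uint64_t root_table_pa, uint64_t mode)
{
	uint64_t ppn = root_table_pa >> DEMO_PAGE_SHIFT;  /* same idea as PFN_DOWN() */

	return ppn | mode;
}
```

For example, a root table at physical address 0x80400000 under Sv39
gives a PPN of 0x80400 with the mode field set to 8.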