diff mbox

[1/3] power, vfs: move away from PF_KTHREAD freezing in favor of fs freezing

Message ID alpine.LNX.2.00.1510301439080.17538@pobox.suse.cz (mailing list archive)
State New, archived
Headers show

Commit Message

Jiri Kosina Oct. 30, 2015, 1:47 p.m. UTC
From: Jiri Kosina <jkosina@suse.cz>

Freeze all filesystems during hibernation in favor of dropping kthread
freezing completely.

Kthread freezing has a history of not very well defined semantics.
Historically, it has been established to make sure that kthreads which are
helpers to writing out data to disk are frozen during hibernation at
well-defined state, so that it's guaranteed that they freeze only after all the
outstanding I/O made it to disk (userspace has already been frozen by that
time, so there is no new I/O being issued).

One of the issues with kthread freezer is that the places where try_to_freeze()
is called in kthreads is rather random / arbitrary. This stems mostly from
the fact that there is actually no good definition / list of requirements
that should be enforced on a frozen kthread (it's important to mention that
frozen kthread for example doesn't automatically imply that everything that
has been spawned asynchronously from it (such as timers) is stopped as well).

Basically the main argument why kthread freezer is not needed boils down to
this: the only facility that is needed during suspend: "no persistent fs
changes are allowed from now on".

	- this is something we get from fs freezing for free
	- Why do we issue all those try_to_freeze() calls in kthreads that
	  don't generate any I/O themselves at all?
	- Why do we issue all those try_to_freeze() calls in kthreads that
	  are actual I/O helpers? (if there is outstanding I/O, we *want*
	  it to happen before hibernating).

This patch removes freezing of kthreads during suspend, and issues a global
filesystem freeze instead.

We awoid freezing in-memory filesystems, because (a) they end up in the
image in a consistent state anyway (b) we will deadlock, as write to
/sys/power/state would never succeed.

The patch could have been made a bit nicer if generic iterate_supers()
could be used, but the locking (especially s_umount semantics) would have to
be redone, so that'd be something to postpone for later.
Also, the 'skip_virtual' flag is superfluous for now (as hibernation is the
only user of this facility for the time being), but I'd like to keep it
there to avoid further confusion regarding the fact that freeze_all_supers()
actually skips in-memory filesystems.

Based on prior work done by Rafael Wysocki.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 drivers/xen/manage.c     |  6 ----
 fs/super.c               | 84 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/freezer.h  |  2 --
 include/linux/fs.h       |  2 ++
 kernel/power/hibernate.c |  5 ---
 kernel/power/power.h     | 20 +-----------
 kernel/power/process.c   | 63 +++++++++---------------------------
 kernel/power/user.c      |  9 ------
 8 files changed, 103 insertions(+), 88 deletions(-)

Comments

kernel test robot Oct. 30, 2015, 2:04 p.m. UTC | #1
Hi Jiri,

[auto build test WARNING on v4.3-rc7 -- if it's inappropriate base, please suggest rules for selecting the more suitable base]

url:    https://github.com/0day-ci/linux/commits/Jiri-Kosina/PM-vfs-use-filesystem-freezing-instead-of-kthread-freezer/20151030-215223
config: x86_64-randconfig-x014-10300134 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   kernel/power/hibernate.c: In function 'hibernation_snapshot':
>> kernel/power/hibernate.c:401:2: warning: label 'Cleanup' defined but not used [-Wunused-label]
     Cleanup:
     ^

vim +/Cleanup +401 kernel/power/hibernate.c

64a473cb7 kernel/power/hibernate.c Rafael J. Wysocki 2009-07-08  385  		swsusp_free();
64a473cb7 kernel/power/hibernate.c Rafael J. Wysocki 2009-07-08  386  
91e7c75ba kernel/power/hibernate.c Rafael J. Wysocki 2011-05-17  387  	msg = in_suspend ? (error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE;
91e7c75ba kernel/power/hibernate.c Rafael J. Wysocki 2011-05-17  388  	dpm_resume(msg);
c9e664f1f kernel/power/hibernate.c Rafael J. Wysocki 2010-12-03  389  
c9e664f1f kernel/power/hibernate.c Rafael J. Wysocki 2010-12-03  390  	if (error || !in_suspend)
c9e664f1f kernel/power/hibernate.c Rafael J. Wysocki 2010-12-03  391  		pm_restore_gfp_mask();
c9e664f1f kernel/power/hibernate.c Rafael J. Wysocki 2010-12-03  392  
7777fab98 kernel/power/disk.c      Rafael J. Wysocki 2007-07-19  393  	resume_console();
91e7c75ba kernel/power/hibernate.c Rafael J. Wysocki 2011-05-17  394  	dpm_complete(msg);
91e7c75ba kernel/power/hibernate.c Rafael J. Wysocki 2011-05-17  395  
caea99ef3 kernel/power/disk.c      Rafael J. Wysocki 2008-01-08  396   Close:
caea99ef3 kernel/power/disk.c      Rafael J. Wysocki 2008-01-08  397  	platform_end(platform_mode);
7777fab98 kernel/power/disk.c      Rafael J. Wysocki 2007-07-19  398  	return error;
d8f3de0d2 kernel/power/disk.c      Rafael J. Wysocki 2008-06-12  399  
51d6ff7ac kernel/power/hibernate.c Srivatsa S. Bhat  2012-02-04  400   Thaw:
bb58dd5d1 kernel/power/hibernate.c Rafael J. Wysocki 2011-11-22 @401   Cleanup:
bb58dd5d1 kernel/power/hibernate.c Rafael J. Wysocki 2011-11-22  402  	swsusp_free();
bb58dd5d1 kernel/power/hibernate.c Rafael J. Wysocki 2011-11-22  403  	goto Close;
7777fab98 kernel/power/disk.c      Rafael J. Wysocki 2007-07-19  404  }
7777fab98 kernel/power/disk.c      Rafael J. Wysocki 2007-07-19  405  
7777fab98 kernel/power/disk.c      Rafael J. Wysocki 2007-07-19  406  /**
f42a9813f kernel/power/hibernate.c Rafael J. Wysocki 2011-05-24  407   * resume_target_kernel - Restore system state from a hibernation image.
f42a9813f kernel/power/hibernate.c Rafael J. Wysocki 2011-05-24  408   * @platform_mode: Whether or not to use the platform driver.
f42a9813f kernel/power/hibernate.c Rafael J. Wysocki 2011-05-24  409   *

:::::: The code at line 401 was first introduced by commit
:::::: bb58dd5d1ffad6c2d21c69698ba766dad4ae54e6 PM / Hibernate: Do not leak memory in error/test code paths

:::::: TO: Rafael J. Wysocki <rjw@sisk.pl>
:::::: CC: Rafael J. Wysocki <rjw@sisk.pl>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Oliver Neukum Oct. 31, 2015, 8:55 a.m. UTC | #2
On Fri, 2015-10-30 at 14:47 +0100, Jiri Kosina wrote:
> Basically the main argument why kthread freezer is not needed boils
> down to
> this: the only facility that is needed during suspend: "no persistent
> fs
> changes are allowed from now on".

Is that true? Drivers of character devices also may assume
that IO and suspend() wouldn't race (except in fairly clear exceptions)

	Regards
		Oliver


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
NeilBrown Nov. 2, 2015, 3:01 a.m. UTC | #3
On Sat, Oct 31 2015, Oliver Neukum wrote:

> On Fri, 2015-10-30 at 14:47 +0100, Jiri Kosina wrote:
>> Basically the main argument why kthread freezer is not needed boils
>> down to
>> this: the only facility that is needed during suspend: "no persistent
>> fs
>> changes are allowed from now on".
>
> Is that true? Drivers of character devices also may assume
> that IO and suspend() wouldn't race (except in fairly clear exceptions)

Anything that is a device can hook in to the suspend using the standard
call back and make sure no problematic races happen.

Filesystems are different partly because the "know" in memory some
details of what is persisting on storage, and partly because they aren't
devices.

Maybe filesystems should be devices .... though that probably wouldn't
help much.


I would suggest that "the only facility that is needed during suspend"
is really "well defined states and well defined transitions".
Filesystems need them as well as devices, and frozen kthreads might have
been a kludge to try to provide them.

NeilBrown
yalin wang Nov. 2, 2015, 7:54 a.m. UTC | #4
> On Oct 30, 2015, at 21:47, Jiri Kosina <jikos@kernel.org> wrote:
> 
> From: Jiri Kosina <jkosina@suse.cz>
> 
> Freeze all filesystems during hibernation in favor of dropping kthread
> freezing completely.
> 
> Kthread freezing has a history of not very well defined semantics.
> Historically, it has been established to make sure that kthreads which are
> helpers to writing out data to disk are frozen during hibernation at
> well-defined state, so that it's guaranteed that they freeze only after all the
> outstanding I/O made it to disk (userspace has already been frozen by that
> time, so there is no new I/O being issued).
> 
> One of the issues with kthread freezer is that the places where try_to_freeze()
> is called in kthreads is rather random / arbitrary. This stems mostly from
> the fact that there is actually no good definition / list of requirements
> that should be enforced on a frozen kthread (it's important to mention that
> frozen kthread for example doesn't automatically imply that everything that
> has been spawned asynchronously from it (such as timers) is stopped as well).
> 
> Basically the main argument why kthread freezer is not needed boils down to
> this: the only facility that is needed during suspend: "no persistent fs
> changes are allowed from now on".
> 
> 	- this is something we get from fs freezing for free
> 	- Why do we issue all those try_to_freeze() calls in kthreads that
> 	  don't generate any I/O themselves at all?
> 	- Why do we issue all those try_to_freeze() calls in kthreads that
> 	  are actual I/O helpers? (if there is outstanding I/O, we *want*
> 	  it to happen before hibernating).
> 
> This patch removes freezing of kthreads during suspend, and issues a global
> filesystem freeze instead.
> 
> We awoid freezing in-memory filesystems, because (a) they end up in the
> image in a consistent state anyway (b) we will deadlock, as write to
> /sys/power/state would never succeed.
> 
> The patch could have been made a bit nicer if generic iterate_supers()
> could be used, but the locking (especially s_umount semantics) would have to
> be redone, so that'd be something to postpone for later.
> Also, the 'skip_virtual' flag is superfluous for now (as hibernation is the
> only user of this facility for the time being), but I'd like to keep it
> there to avoid further confusion regarding the fact that freeze_all_supers()
> actually skips in-memory filesystems.
> 
> Based on prior work done by Rafael Wysocki.
> 
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
> ---
> drivers/xen/manage.c     |  6 ----
> fs/super.c               | 84 ++++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/freezer.h  |  2 --
> include/linux/fs.h       |  2 ++
> kernel/power/hibernate.c |  5 ---
> kernel/power/power.h     | 20 +-----------
> kernel/power/process.c   | 63 +++++++++---------------------------
> kernel/power/user.c      |  9 ------
> 8 files changed, 103 insertions(+), 88 deletions(-)
> 
> diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
> index e12bd36..d47716a 100644
> --- a/drivers/xen/manage.c
> +++ b/drivers/xen/manage.c
> @@ -109,12 +109,6 @@ static void do_suspend(void)
> 		goto out;
> 	}
> 
> -	err = freeze_kernel_threads();
> -	if (err) {
> -		pr_err("%s: freeze kernel threads failed %d\n", __func__, err);
> -		goto out_thaw;
> -	}
> -
> 	err = dpm_suspend_start(PMSG_FREEZE);
> 	if (err) {
> 		pr_err("%s: dpm_suspend_start %d\n", __func__, err);
> diff --git a/fs/super.c b/fs/super.c
> index 954aeb8..b7cb50f 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -1373,3 +1373,87 @@ out:
> 	return 0;
> }
> EXPORT_SYMBOL(thaw_super);
> +
> +static bool super_should_freeze(struct super_block *sb, bool skip_virtual)
> +{
> +	if (!sb->s_root)
> +		return false;
> +	if (!(sb->s_flags & MS_BORN))
> +		return false;
> +	/* Should we freeze virtual filesystems? */
> +	if (sb->s_bdi == &noop_backing_dev_info && skip_virtual)
> +		return false;
> +	/* No need to freeze read-only filesystems */
> +	if (sb->s_flags & MS_RDONLY)
> +		return false;
> +	return true;
> +}
> +
> +/**
> + * freeze_all_supers -- iterate through all filesystems and freeze them
> + * @skip_virtual: should those with no backing device be skipped?
> + *
> + * Iterate over all superblocks and call freeze_super() for them. Some
> + * use-cases (such as freezer) might want to have to skip those which
> + * don't have any backing bdev.
> + *
> + */
> +int freeze_all_supers(bool skip_virtual)
> +{
> +	struct super_block *sb, *p = NULL;
> +	int error = 0;
> +
> +	spin_lock(&sb_lock);
> +	/*
> +	 * The list of super-blocks is iterated in a reverse order so that
> +	 * inter-dependencies (such as loopback devices) are handled in
> +	 * a non-deadlocking order
> +	 */
> +	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
> +		if (hlist_unhashed(&sb->s_instances))
> +			continue;
> +		sb->s_count++;
> +
> +		spin_unlock(&sb_lock);
> +		if (super_should_freeze(sb, skip_virtual)) {
> +			error = freeze_super(sb);
> +			if (error) {
> +				spin_lock(&sb_lock);
> +				break;
> +			}
> +		}
> +		spin_lock(&sb_lock);
> +
> +		if (p)
> +			__put_super(p);
> +		p = sb;
> +	}
> +	if (p)
> +		__put_super(p);
> +	spin_unlock(&sb_lock);
> +
> +	return error;
> +}
> +
> +void thaw_all_supers(void)
> +{
> +	struct super_block *sb, *p = NULL;
> +
> +	spin_lock(&sb_lock);
> +	list_for_each_entry(sb, &super_blocks, s_list) {
> +		if (hlist_unhashed(&sb->s_instances))
> +			continue;
> +		sb->s_count++;
> +
> +		spin_unlock(&sb_lock);
> +		thaw_super(sb);
> +		spin_lock(&sb_lock);
> +
> +		if (p)
> +			__put_super(p);
> +		p = sb;
> +	}
> +	if (p)
> +		__put_super(p);
> +	spin_unlock(&sb_lock);
> +}
> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> index 6b7fd9c..81335f6 100644
> --- a/include/linux/freezer.h
> +++ b/include/linux/freezer.h
> @@ -43,9 +43,7 @@ extern void __thaw_task(struct task_struct *t);
> 
> extern bool __refrigerator(bool check_kthr_stop);
> extern int freeze_processes(void);
> -extern int freeze_kernel_threads(void);
> extern void thaw_processes(void);
> -extern void thaw_kernel_threads(void);
> 
> /*
>  * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 72d8a84..8ef2605 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2029,6 +2029,8 @@ extern int fd_statfs(int, struct kstatfs *);
> extern int vfs_ustat(dev_t, struct kstatfs *);
> extern int freeze_super(struct super_block *super);
> extern int thaw_super(struct super_block *super);
> +extern int freeze_all_supers(bool);
> +extern void thaw_all_supers(void);
> extern bool our_mnt(struct vfsmount *mnt);
> 
> extern int current_umask(void);
> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index 690f78f..d5c36bb 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -348,10 +348,6 @@ int hibernation_snapshot(int platform_mode)
> 	if (error)
> 		goto Close;
> 
> -	error = freeze_kernel_threads();
> -	if (error)
> -		goto Cleanup;
> -
> 	if (hibernation_test(TEST_FREEZER)) {
> 
> 		/*
> @@ -402,7 +398,6 @@ int hibernation_snapshot(int platform_mode)
> 	return error;
> 
>  Thaw:
> -	thaw_kernel_threads();
>  Cleanup:
> 	swsusp_free();
> 	goto Close;
> diff --git a/kernel/power/power.h b/kernel/power/power.h
> index caadb56..4c3267f 100644
> --- a/kernel/power/power.h
> +++ b/kernel/power/power.h
> @@ -224,25 +224,7 @@ extern int pm_test_level;
> #ifdef CONFIG_SUSPEND_FREEZER
> static inline int suspend_freeze_processes(void)
> {
> -	int error;
> -
> -	error = freeze_processes();
> -	/*
> -	 * freeze_processes() automatically thaws every task if freezing
> -	 * fails. So we need not do anything extra upon error.
> -	 */
> -	if (error)
> -		return error;
> -
> -	error = freeze_kernel_threads();
> -	/*
> -	 * freeze_kernel_threads() thaws only kernel threads upon freezing
> -	 * failure. So we have to thaw the userspace tasks ourselves.
> -	 */
> -	if (error)
> -		thaw_processes();
> -
> -	return error;
> +	return freeze_processes();
> }
> 
> static inline void suspend_thaw_processes(void)
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 564f786..b1ad215 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -17,6 +17,7 @@
> #include <linux/delay.h>
> #include <linux/workqueue.h>
> #include <linux/kmod.h>
> +#include <linux/fs.h>
> #include <trace/events/power.h>
> 
> /* 
> @@ -140,6 +141,17 @@ int freeze_processes(void)
> 	pr_cont("\n");
> 	BUG_ON(in_atomic());
> 
> +	pr_info("Freezing filesystems ... ");
> +
> +	error = freeze_all_supers(true);
> +	if (error) {
> +		pr_cont("failed.\n");
> +		thaw_processes();
> +		return error;
> +	}
> +
> +	pr_cont("done.\n");
> +
> 	/*
> 	 * Now that the whole userspace is frozen we need to disbale
> 	 * the OOM killer to disallow any further interference with
> @@ -148,35 +160,10 @@ int freeze_processes(void)
> 	if (!error && !oom_killer_disable())
> 		error = -EBUSY;
> 
> -	if (error)
> +	if (error) {
> 		thaw_processes();
> -	return error;
> -}
> -
> -/**
> - * freeze_kernel_threads - Make freezable kernel threads go to the refrigerator.
> - *
> - * On success, returns 0.  On failure, -errno and only the kernel threads are
> - * thawed, so as to give a chance to the caller to do additional cleanups
> - * (if any) before thawing the userspace tasks. So, it is the responsibility
> - * of the caller to thaw the userspace tasks, when the time is right.
> - */
> -int freeze_kernel_threads(void)
> -{
> -	int error;
> -
> -	pr_info("Freezing remaining freezable tasks ... ");
> -
> -	pm_nosig_freezing = true;
> -	error = try_to_freeze_tasks(false);
> -	if (!error)
> -		pr_cont("done.");
> -
> -	pr_cont("\n");
> -	BUG_ON(in_atomic());
> -
> -	if (error)
> -		thaw_kernel_threads();
> +		thaw_all_supers();
> +	}
> 	return error;
> }
> 
> @@ -197,6 +184,7 @@ void thaw_processes(void)
> 
> 	__usermodehelper_set_disable_depth(UMH_FREEZING);
> 	thaw_workqueues();
> +	thaw_all_supers();
> 
> 	read_lock(&tasklist_lock);
> 	for_each_process_thread(g, p) {
> @@ -216,22 +204,3 @@ void thaw_processes(void)
> 	trace_suspend_resume(TPS("thaw_processes"), 0, false);
> }
> 
> -void thaw_kernel_threads(void)
> -{
> -	struct task_struct *g, *p;
> -
> -	pm_nosig_freezing = false;
> -	pr_info("Restarting kernel threads ... ");
> -
> -	thaw_workqueues();
> -
> -	read_lock(&tasklist_lock);
> -	for_each_process_thread(g, p) {
> -		if (p->flags & (PF_KTHREAD | PF_WQ_WORKER))
> -			__thaw_task(p);
> -	}
> -	read_unlock(&tasklist_lock);
> -
> -	schedule();
> -	pr_cont("done.\n");
> -}
> diff --git a/kernel/power/user.c b/kernel/power/user.c
> index 526e891..75771c0 100644
> --- a/kernel/power/user.c
> +++ b/kernel/power/user.c
> @@ -275,15 +275,6 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
> 		swsusp_free();
> 		memset(&data->handle, 0, sizeof(struct snapshot_handle));
> 		data->ready = false;
> -		/*
> -		 * It is necessary to thaw kernel threads here, because
> -		 * SNAPSHOT_CREATE_IMAGE may be invoked directly after
> -		 * SNAPSHOT_FREE.  In that case, if kernel threads were not
> -		 * thawed, the preallocation of memory carried out by
> -		 * hibernation_snapshot() might run into problems (i.e. it
> -		 * might fail or even deadlock).
> -		 */
> -		thaw_kernel_threads();
> 		break;
> 
> 	case SNAPSHOT_PREF_IMAGE_SIZE:
> -- 
> 1.9.2
do you test your patch on kthread_bind()  kernel thread ?
if you remove freeze_kernel_threads() function,
this means lots of pthread will be running status during suspend .
will have problem for bind thread , these thread will be migrate to other 
CPU , will have problem running code on other CPU, another problems is 
that the cpu_allowed_ptr is changed and will never be restore to original CPU mask .
this is not unsafe .
for example, if there is a thread like this :

kthread_bind()
while (!pthread_should_stop())
{
	do_something();

	try_to_freeze();
}

if remove try_to_freeze() , this thread maybe migrate to other CPU if the thread is running .
freeze kernel thread can make sure lots of kernel thread are not in runq during suspend ,
then we can avoid task migration.

Thanks
















--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Kosina Nov. 2, 2015, 11:05 a.m. UTC | #5
On Mon, 2 Nov 2015, yalin wang wrote:

> do you test your patch on kthread_bind()  kernel thread ?
> if you remove freeze_kernel_threads() function,
> this means lots of pthread will be running status during suspend .

pthreads? Again, userspace is frozen by that time.

> will have problem for bind thread , these thread will be migrate to other 
> CPU , will have problem running code on other CPU, another problems is 
> that the cpu_allowed_ptr is changed and will never be restore to original CPU mask .

Hmm, shouldn't that be fixed instead?

I mean, select_fallback_rq() will surely force a rq outside of the 
cpus_allowed if needed. I can imagine kthread bound to a particular CPU 
either wanting to keep itself affine to that CPU (and be scheduled out 
until CPU becomes online again), or it might want to actually be forced to 
another CPU and migrated back once "its own" CPU becomes available again.

But then yes, indeed, at the end of the day this is more or less what 
try_to_freeze() gives you. Fair enough, this might be one of cases where 
freezer for kthreads is actually useful (and would need to be documented 
to provide this semantics).

kthread CPU binding seems to be used by ~7 kthreads altogether in the 
kernel. Funily enough, quite some of them do not call try_to_freeze() at 
all. Which just demonstrates my point regarding how messy the current 
state of affairs is.
diff mbox

Patch

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index e12bd36..d47716a 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -109,12 +109,6 @@  static void do_suspend(void)
 		goto out;
 	}
 
-	err = freeze_kernel_threads();
-	if (err) {
-		pr_err("%s: freeze kernel threads failed %d\n", __func__, err);
-		goto out_thaw;
-	}
-
 	err = dpm_suspend_start(PMSG_FREEZE);
 	if (err) {
 		pr_err("%s: dpm_suspend_start %d\n", __func__, err);
diff --git a/fs/super.c b/fs/super.c
index 954aeb8..b7cb50f 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1373,3 +1373,87 @@  out:
 	return 0;
 }
 EXPORT_SYMBOL(thaw_super);
+
+static bool super_should_freeze(struct super_block *sb, bool skip_virtual)
+{
+	if (!sb->s_root)
+		return false;
+	if (!(sb->s_flags & MS_BORN))
+		return false;
+	/* Should we freeze virtual filesystems? */
+	if (sb->s_bdi == &noop_backing_dev_info && skip_virtual)
+		return false;
+	/* No need to freeze read-only filesystems */
+	if (sb->s_flags & MS_RDONLY)
+		return false;
+	return true;
+}
+
+/**
+ * freeze_all_supers -- iterate through all filesystems and freeze them
+ * @skip_virtual: should those with no backing device be skipped?
+ *
+ * Iterate over all superblocks and call freeze_super() for them. Some
+ * use-cases (such as freezer) might want to have to skip those which
+ * don't have any backing bdev.
+ *
+ */
+int freeze_all_supers(bool skip_virtual)
+{
+	struct super_block *sb, *p = NULL;
+	int error = 0;
+
+	spin_lock(&sb_lock);
+	/*
+	 * The list of super-blocks is iterated in a reverse order so that
+	 * inter-dependencies (such as loopback devices) are handled in
+	 * a non-deadlocking order
+	 */
+	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
+		if (hlist_unhashed(&sb->s_instances))
+			continue;
+		sb->s_count++;
+
+		spin_unlock(&sb_lock);
+		if (super_should_freeze(sb, skip_virtual)) {
+			error = freeze_super(sb);
+			if (error) {
+				spin_lock(&sb_lock);
+				break;
+			}
+		}
+		spin_lock(&sb_lock);
+
+		if (p)
+			__put_super(p);
+		p = sb;
+	}
+	if (p)
+		__put_super(p);
+	spin_unlock(&sb_lock);
+
+	return error;
+}
+
+void thaw_all_supers(void)
+{
+	struct super_block *sb, *p = NULL;
+
+	spin_lock(&sb_lock);
+	list_for_each_entry(sb, &super_blocks, s_list) {
+		if (hlist_unhashed(&sb->s_instances))
+			continue;
+		sb->s_count++;
+
+		spin_unlock(&sb_lock);
+		thaw_super(sb);
+		spin_lock(&sb_lock);
+
+		if (p)
+			__put_super(p);
+		p = sb;
+	}
+	if (p)
+		__put_super(p);
+	spin_unlock(&sb_lock);
+}
diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 6b7fd9c..81335f6 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -43,9 +43,7 @@  extern void __thaw_task(struct task_struct *t);
 
 extern bool __refrigerator(bool check_kthr_stop);
 extern int freeze_processes(void);
-extern int freeze_kernel_threads(void);
 extern void thaw_processes(void);
-extern void thaw_kernel_threads(void);
 
 /*
  * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 72d8a84..8ef2605 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2029,6 +2029,8 @@  extern int fd_statfs(int, struct kstatfs *);
 extern int vfs_ustat(dev_t, struct kstatfs *);
 extern int freeze_super(struct super_block *super);
 extern int thaw_super(struct super_block *super);
+extern int freeze_all_supers(bool);
+extern void thaw_all_supers(void);
 extern bool our_mnt(struct vfsmount *mnt);
 
 extern int current_umask(void);
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 690f78f..d5c36bb 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -348,10 +348,6 @@  int hibernation_snapshot(int platform_mode)
 	if (error)
 		goto Close;
 
-	error = freeze_kernel_threads();
-	if (error)
-		goto Cleanup;
-
 	if (hibernation_test(TEST_FREEZER)) {
 
 		/*
@@ -402,7 +398,6 @@  int hibernation_snapshot(int platform_mode)
 	return error;
 
  Thaw:
-	thaw_kernel_threads();
  Cleanup:
 	swsusp_free();
 	goto Close;
diff --git a/kernel/power/power.h b/kernel/power/power.h
index caadb56..4c3267f 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -224,25 +224,7 @@  extern int pm_test_level;
 #ifdef CONFIG_SUSPEND_FREEZER
 static inline int suspend_freeze_processes(void)
 {
-	int error;
-
-	error = freeze_processes();
-	/*
-	 * freeze_processes() automatically thaws every task if freezing
-	 * fails. So we need not do anything extra upon error.
-	 */
-	if (error)
-		return error;
-
-	error = freeze_kernel_threads();
-	/*
-	 * freeze_kernel_threads() thaws only kernel threads upon freezing
-	 * failure. So we have to thaw the userspace tasks ourselves.
-	 */
-	if (error)
-		thaw_processes();
-
-	return error;
+	return freeze_processes();
 }
 
 static inline void suspend_thaw_processes(void)
diff --git a/kernel/power/process.c b/kernel/power/process.c
index 564f786..b1ad215 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -17,6 +17,7 @@ 
 #include <linux/delay.h>
 #include <linux/workqueue.h>
 #include <linux/kmod.h>
+#include <linux/fs.h>
 #include <trace/events/power.h>
 
 /* 
@@ -140,6 +141,17 @@  int freeze_processes(void)
 	pr_cont("\n");
 	BUG_ON(in_atomic());
 
+	pr_info("Freezing filesystems ... ");
+
+	error = freeze_all_supers(true);
+	if (error) {
+		pr_cont("failed.\n");
+		thaw_processes();
+		return error;
+	}
+
+	pr_cont("done.\n");
+
 	/*
 	 * Now that the whole userspace is frozen we need to disbale
 	 * the OOM killer to disallow any further interference with
@@ -148,35 +160,10 @@  int freeze_processes(void)
 	if (!error && !oom_killer_disable())
 		error = -EBUSY;
 
-	if (error)
+	if (error) {
 		thaw_processes();
-	return error;
-}
-
-/**
- * freeze_kernel_threads - Make freezable kernel threads go to the refrigerator.
- *
- * On success, returns 0.  On failure, -errno and only the kernel threads are
- * thawed, so as to give a chance to the caller to do additional cleanups
- * (if any) before thawing the userspace tasks. So, it is the responsibility
- * of the caller to thaw the userspace tasks, when the time is right.
- */
-int freeze_kernel_threads(void)
-{
-	int error;
-
-	pr_info("Freezing remaining freezable tasks ... ");
-
-	pm_nosig_freezing = true;
-	error = try_to_freeze_tasks(false);
-	if (!error)
-		pr_cont("done.");
-
-	pr_cont("\n");
-	BUG_ON(in_atomic());
-
-	if (error)
-		thaw_kernel_threads();
+		thaw_all_supers();
+	}
 	return error;
 }
 
@@ -197,6 +184,7 @@  void thaw_processes(void)
 
 	__usermodehelper_set_disable_depth(UMH_FREEZING);
 	thaw_workqueues();
+	thaw_all_supers();
 
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, p) {
@@ -216,22 +204,3 @@  void thaw_processes(void)
 	trace_suspend_resume(TPS("thaw_processes"), 0, false);
 }
 
-void thaw_kernel_threads(void)
-{
-	struct task_struct *g, *p;
-
-	pm_nosig_freezing = false;
-	pr_info("Restarting kernel threads ... ");
-
-	thaw_workqueues();
-
-	read_lock(&tasklist_lock);
-	for_each_process_thread(g, p) {
-		if (p->flags & (PF_KTHREAD | PF_WQ_WORKER))
-			__thaw_task(p);
-	}
-	read_unlock(&tasklist_lock);
-
-	schedule();
-	pr_cont("done.\n");
-}
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 526e891..75771c0 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -275,15 +275,6 @@  static long snapshot_ioctl(struct file *filp, unsigned int cmd,
 		swsusp_free();
 		memset(&data->handle, 0, sizeof(struct snapshot_handle));
 		data->ready = false;
-		/*
-		 * It is necessary to thaw kernel threads here, because
-		 * SNAPSHOT_CREATE_IMAGE may be invoked directly after
-		 * SNAPSHOT_FREE.  In that case, if kernel threads were not
-		 * thawed, the preallocation of memory carried out by
-		 * hibernation_snapshot() might run into problems (i.e. it
-		 * might fail or even deadlock).
-		 */
-		thaw_kernel_threads();
 		break;
 
 	case SNAPSHOT_PREF_IMAGE_SIZE: