Message ID | 201108032315.06012.rjw@sisk.pl (mailing list archive) |
---|---|
State | Superseded, archived |
Hi!

> Freeze all filesystems during the freezing of tasks by calling
> freeze_bdev() for each of them and thaw them during the thawing
> of tasks with the help of thaw_bdev().
>
> This is needed by hibernation, because some filesystems (e.g. XFS)
> deadlock with the preallocation of memory used by it if the memory
> pressure caused by it is too heavy.
>
> The additional benefit of this change is that, if something goes
> wrong after filesystems have been frozen, they will stay in a
> consistent state and journal replays won't be necessary (e.g. after
> a failing suspend or resume). In particular, this should help to
> solve a long-standing issue that in some cases during resume from
> hibernation the boot loader causes the journal to be replayed for the
> filesystem containing the kernel image and initrd causing it to
> become inconsistent with the information stored in the hibernation
> image.

> +/**
> + * freeze_filesystems - Force all filesystems into a consistent state.
> + */
> +void freeze_filesystems(void)
> +{
> +	struct super_block *sb;
> +
> +	lockdep_off();

Ouch. So... why do we need to silence this?

> +	/*
> +	 * Freeze in reverse order so filesystems dependant upon others are
> +	 * frozen in the right order (eg. loopback on ext3).
> +	 */
> +	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
> +		if (!sb->s_root || !sb->s_bdev ||
> +		    (sb->s_frozen == SB_FREEZE_TRANS) ||
> +		    (sb->s_flags & MS_RDONLY) ||
> +		    (sb->s_flags & MS_FROZEN))
> +			continue;

Should we stop NFS from modifying remote server, too?

Plus... ext3 writes to read-only filesystems on mount; not sure if it
does it later. But RDONLY means 'user can't write to it' not 'bdev will
not be modified'. Should we freeze all?

How can 'already frozen' happen?

> +	list_for_each_entry(sb, &super_blocks, s_list)
> +		if (sb->s_flags & MS_FROZEN) {
> +			sb->s_flags &= ~MS_FROZEN;
> +			thaw_bdev(sb->s_bdev, sb);
> +		}

...because we'll unfreeze it even if we did not freeze it...

Pavel
On Wednesday, August 03, 2011, Pavel Machek wrote:
> Hi!
>
> > Freeze all filesystems during the freezing of tasks by calling
> > freeze_bdev() for each of them and thaw them during the thawing
> > of tasks with the help of thaw_bdev().
> >
> > This is needed by hibernation, because some filesystems (e.g. XFS)
> > deadlock with the preallocation of memory used by it if the memory
> > pressure caused by it is too heavy.
> >
> > The additional benefit of this change is that, if something goes
> > wrong after filesystems have been frozen, they will stay in a
> > consistent state and journal replays won't be necessary (e.g. after
> > a failing suspend or resume). In particular, this should help to
> > solve a long-standing issue that in some cases during resume from
> > hibernation the boot loader causes the journal to be replayed for the
> > filesystem containing the kernel image and initrd causing it to
> > become inconsistent with the information stored in the hibernation
> > image.
>
> > +/**
> > + * freeze_filesystems - Force all filesystems into a consistent state.
> > + */
> > +void freeze_filesystems(void)
> > +{
> > +	struct super_block *sb;
> > +
> > +	lockdep_off();
>
> Ouch. So... why do we need to silence this?

So that it doesn't complain? :-)

I'll need some time to get the exact details here.

> > +	/*
> > +	 * Freeze in reverse order so filesystems dependant upon others are
> > +	 * frozen in the right order (eg. loopback on ext3).
> > +	 */
> > +	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
> > +		if (!sb->s_root || !sb->s_bdev ||
> > +		    (sb->s_frozen == SB_FREEZE_TRANS) ||
> > +		    (sb->s_flags & MS_RDONLY) ||
> > +		    (sb->s_flags & MS_FROZEN))
> > +			continue;
>
> Should we stop NFS from modifying remote server, too?

What do you mean exactly?

> Plus... ext3 writes to read-only filesystems on mount; not sure if it
> does it later. But RDONLY means 'user can't write to it' not 'bdev will
> not be modified'. Should we freeze all?
>
> How can 'already frozen' happen?
>
> > +	list_for_each_entry(sb, &super_blocks, s_list)
> > +		if (sb->s_flags & MS_FROZEN) {
> > +			sb->s_flags &= ~MS_FROZEN;
> > +			thaw_bdev(sb->s_bdev, sb);
> > +		}
>
> ...because we'll unfreeze it even if we did not freeze it...

So we need not check MS_FROZEN in freeze_filesystems(). OK

Thanks,
Rafael
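
The skip test under discussion can be written out as a small userspace model. This is only a sketch and not kernel code: `struct sb_model`, `SKETCH_RDONLY` and `should_skip()` are hypothetical stand-ins for `struct super_block`, `MS_RDONLY` and the `if (...) continue;` test in the patch. The `freeze_rdonly` parameter models Pavel's open question of whether read-only filesystems should be frozen as well, and the `MS_FROZEN` test is left out, following the conclusion reached above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for MS_RDONLY; the value is illustrative only. */
#define SKETCH_RDONLY (1u << 0)

/* Toy model of the struct super_block fields the patch inspects. */
struct sb_model {
	bool has_root;        /* models sb->s_root != NULL */
	bool has_bdev;        /* models sb->s_bdev != NULL */
	bool already_frozen;  /* models sb->s_frozen == SB_FREEZE_TRANS */
	unsigned int flags;   /* models sb->s_flags */
};

/*
 * Return true if the freeze pass should leave this superblock alone.
 * There is deliberately no "already marked by us" test here: in the
 * patch the marker is set only by the freeze pass and cleared by the
 * thaw pass.
 */
bool should_skip(const struct sb_model *sb, bool freeze_rdonly)
{
	if (!sb->has_root || !sb->has_bdev)
		return true;    /* not mounted, or no backing block device */
	if (sb->already_frozen)
		return true;    /* frozen through some other path */
	if ((sb->flags & SKETCH_RDONLY) && !freeze_rdonly)
		return true;    /* read-only mount, policy says skip */
	return false;
}
```

With `freeze_rdonly` set to false the model matches the read-only test in the patch; setting it to true corresponds to the "freeze all" alternative raised in the review.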
On Thursday, August 04, 2011, Rafael J. Wysocki wrote:
> On Wednesday, August 03, 2011, Pavel Machek wrote:
> > Hi!
> >
> > > Freeze all filesystems during the freezing of tasks by calling
> > > freeze_bdev() for each of them and thaw them during the thawing
> > > of tasks with the help of thaw_bdev().
> > >
> > > This is needed by hibernation, because some filesystems (e.g. XFS)
> > > deadlock with the preallocation of memory used by it if the memory
> > > pressure caused by it is too heavy.
> > >
> > > The additional benefit of this change is that, if something goes
> > > wrong after filesystems have been frozen, they will stay in a
> > > consistent state and journal replays won't be necessary (e.g. after
> > > a failing suspend or resume). In particular, this should help to
> > > solve a long-standing issue that in some cases during resume from
> > > hibernation the boot loader causes the journal to be replayed for the
> > > filesystem containing the kernel image and initrd causing it to
> > > become inconsistent with the information stored in the hibernation
> > > image.
> >
> > > +/**
> > > + * freeze_filesystems - Force all filesystems into a consistent state.
> > > + */
> > > +void freeze_filesystems(void)
> > > +{
> > > +	struct super_block *sb;
> > > +
> > > +	lockdep_off();
> >
> > Ouch. So... why do we need to silence this?
>
> So that it doesn't complain? :-)
>
> I'll need some time to get the exact details here.

So, this is because ext3_freeze() doesn't call journal_unlock_updates()
on success, which quite frankly looks like a bug in ext3 to me. At least
that's different from what ext4 does in exactly the same situation (which
looks correct). If ext3_freeze() called journal_unlock_updates() on
success too and the call to journal_unlock_updates() were removed from
ext3_unfreeze(), we wouldn't need that lockdep_off()/lockdep_on() around
the loop.

I need someone with ext3/ext4 knowledge to comment here, though.
Moreover, I'm not sure whether other filesystems do such things as well.

Anyway, this is just a false positive, even with the ext3 code as is.

> > > +	/*
> > > +	 * Freeze in reverse order so filesystems dependant upon others are
> > > +	 * frozen in the right order (eg. loopback on ext3).
> > > +	 */
> > > +	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
> > > +		if (!sb->s_root || !sb->s_bdev ||
> > > +		    (sb->s_frozen == SB_FREEZE_TRANS) ||
> > > +		    (sb->s_flags & MS_RDONLY) ||
> > > +		    (sb->s_flags & MS_FROZEN))
> > > +			continue;
> >
> > Should we stop NFS from modifying remote server, too?
>
> What do you mean exactly?
>
> > Plus... ext3 writes to read-only filesystems on mount; not sure if it
> > does it later. But RDONLY means 'user can't write to it' not 'bdev will
> > not be modified'. Should we freeze all?
> >
> > How can 'already frozen' happen?
> >
> > > +	list_for_each_entry(sb, &super_blocks, s_list)
> > > +		if (sb->s_flags & MS_FROZEN) {
> > > +			sb->s_flags &= ~MS_FROZEN;
> > > +			thaw_bdev(sb->s_bdev, sb);
> > > +		}
> >
> > ...because we'll unfreeze it even if we did not freeze it...
>
> So we need not check MS_FROZEN in freeze_filesystems(). OK

Thanks,
Rafael
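
The imbalance described in this message can be illustrated with a toy userspace model. Nothing below is ext3, ext4 or lockdep code: `struct toy_journal`, the `held` counter and all `toy_*`/`freeze_*`/`thaw_*` functions are hypothetical, and the two behaviours are taken from the description in the message rather than checked against the filesystem sources. The counter stands in for the property a lock checker cares about here, namely whether a function returns with a lock still held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical journal: 'held' counts outstanding "lock updates" calls. */
struct toy_journal {
	int held;
	bool frozen;
};

void toy_lock_updates(struct toy_journal *j)
{
	j->held++;
}

void toy_unlock_updates(struct toy_journal *j)
{
	assert(j->held > 0);    /* unlocking without a lock is a bug */
	j->held--;
}

/* Freeze that returns with the lock still held on success. */
void freeze_holding(struct toy_journal *j)
{
	toy_lock_updates(j);
	j->frozen = true;
	/* no unlock here: the lock is carried over to the thaw side */
}

/* Matching thaw: drops the lock taken in freeze_holding(). */
void thaw_holding(struct toy_journal *j)
{
	j->frozen = false;
	toy_unlock_updates(j);
}

/* Balanced variant: the lock is dropped before freeze returns. */
void freeze_balanced(struct toy_journal *j)
{
	toy_lock_updates(j);
	j->frozen = true;
	toy_unlock_updates(j);
}

/* Balanced thaw has nothing left to unlock. */
void thaw_balanced(struct toy_journal *j)
{
	j->frozen = false;
}
```

In this model `freeze_holding()` leaves `held` at 1 until `thaw_holding()` runs, while the balanced pair keeps `held` at 0 between the two calls. Both pairs end up consistent, which matches the remark that the warning is a false positive.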
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h
+++ linux-2.6/include/linux/fs.h
@@ -211,6 +211,7 @@ struct inodes_stat_t {
 #define MS_KERNMOUNT	(1<<22) /* this is a kern_mount call */
 #define MS_I_VERSION	(1<<23) /* Update inode I_version field */
 #define MS_STRICTATIME	(1<<24) /* Always perform atime updates */
+#define MS_FROZEN	(1<<25) /* bdev has been frozen */
 #define MS_NOSEC	(1<<28)
 #define MS_BORN		(1<<29)
 #define MS_ACTIVE	(1<<30)
@@ -2047,6 +2048,8 @@ extern struct super_block *freeze_bdev(s
 extern void emergency_thaw_all(void);
 extern int thaw_bdev(struct block_device *bdev, struct super_block *sb);
 extern int fsync_bdev(struct block_device *);
+extern void freeze_filesystems(void);
+extern void thaw_filesystems(void);
 #else
 static inline void bd_forget(struct inode *inode) {}
 static inline int sync_blockdev(struct block_device *bdev) { return 0; }
@@ -2061,6 +2064,9 @@ static inline int thaw_bdev(struct block
 {
 	return 0;
 }
+
+static inline void freeze_filesystems(void) {}
+static inline void thaw_filesystems(void) {}
 #endif
 extern int sync_filesystem(struct super_block *);
 extern const struct file_operations def_blk_fops;
Index: linux-2.6/fs/block_dev.c
===================================================================
--- linux-2.6.orig/fs/block_dev.c
+++ linux-2.6/fs/block_dev.c
@@ -314,6 +314,49 @@ out:
 }
 EXPORT_SYMBOL(thaw_bdev);
 
+/**
+ * freeze_filesystems - Force all filesystems into a consistent state.
+ */
+void freeze_filesystems(void)
+{
+	struct super_block *sb;
+
+	lockdep_off();
+	/*
+	 * Freeze in reverse order so filesystems dependant upon others are
+	 * frozen in the right order (eg. loopback on ext3).
+	 */
+	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
+		if (!sb->s_root || !sb->s_bdev ||
+		    (sb->s_frozen == SB_FREEZE_TRANS) ||
+		    (sb->s_flags & MS_RDONLY) ||
+		    (sb->s_flags & MS_FROZEN))
+			continue;
+
+		freeze_bdev(sb->s_bdev);
+		sb->s_flags |= MS_FROZEN;
+	}
+	lockdep_on();
+}
+
+/**
+ * thaw_filesystems - Make all filesystems active again.
+ */
+void thaw_filesystems(void)
+{
+	struct super_block *sb;
+
+	lockdep_off();
+
+	list_for_each_entry(sb, &super_blocks, s_list)
+		if (sb->s_flags & MS_FROZEN) {
+			sb->s_flags &= ~MS_FROZEN;
+			thaw_bdev(sb->s_bdev, sb);
+		}
+
+	lockdep_on();
+}
+
 static int blkdev_writepage(struct page *page, struct writeback_control *wbc)
 {
 	return block_write_full_page(page, blkdev_get_block, wbc);
Index: linux-2.6/kernel/power/process.c
===================================================================
--- linux-2.6.orig/kernel/power/process.c
+++ linux-2.6/kernel/power/process.c
@@ -12,10 +12,10 @@
 #include <linux/oom.h>
 #include <linux/suspend.h>
 #include <linux/module.h>
-#include <linux/syscalls.h>
 #include <linux/freezer.h>
 #include <linux/delay.h>
 #include <linux/workqueue.h>
+#include <linux/fs.h>
 
 /*
  * Timeout for stopping processes
@@ -147,6 +147,10 @@ int freeze_processes(void)
 		goto Exit;
 	printk("done.\n");
 
+	pr_info("Freezing filesystems ... ");
+	freeze_filesystems();
+	pr_info("done.\n");
+
 	printk("Freezing remaining freezable tasks ... ");
 	error = try_to_freeze_tasks(false);
 	if (error)
@@ -188,6 +192,7 @@ void thaw_processes(void)
 	printk("Restarting tasks ... ");
 	thaw_workqueues();
 	thaw_tasks(true);
+	thaw_filesystems();
 	thaw_tasks(false);
 	schedule();
 	printk("done.\n");
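
The ordering and bookkeeping in freeze_filesystems() and thaw_filesystems() can be tried out in userspace with a small model. This is a sketch under stated assumptions, not the kernel implementation: `struct toy_sb`, `struct toy_log`, `TOY_FROZEN` and the `toy_*` functions are hypothetical, an array stands in for the `super_blocks` list, and a single `freezable` field replaces the skip conditions of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_FROZEN  (1u << 25)  /* same bit position the patch uses for MS_FROZEN */
#define TOY_MAX_LOG 16

/* Toy superblock; 'freezable' == 0 models any skip condition in the patch. */
struct toy_sb {
	const char *name;
	unsigned int flags;
	int freezable;
};

/* Records the order in which superblocks were frozen or thawed. */
struct toy_log {
	const char *names[TOY_MAX_LOG];
	size_t count;
};

void toy_log_add(struct toy_log *log, const char *name)
{
	assert(log->count < TOY_MAX_LOG);
	log->names[log->count++] = name;
}

/* True if the i-th recorded operation was on the superblock called 'name'. */
int toy_log_is(const struct toy_log *log, size_t i, const char *name)
{
	return i < log->count && strcmp(log->names[i], name) == 0;
}

/* Walk from last to first, mirroring list_for_each_entry_reverse(). */
void toy_freeze_all(struct toy_sb *sbs, size_t n, struct toy_log *log)
{
	for (size_t i = n; i-- > 0; ) {
		if (!sbs[i].freezable)
			continue;
		sbs[i].flags |= TOY_FROZEN;     /* remember that this pass froze it */
		toy_log_add(log, sbs[i].name);
	}
}

/* Walk from first to last and thaw only what the freeze pass marked. */
void toy_thaw_all(struct toy_sb *sbs, size_t n, struct toy_log *log)
{
	for (size_t i = 0; i < n; i++) {
		if (!(sbs[i].flags & TOY_FROZEN))
			continue;
		sbs[i].flags &= ~TOY_FROZEN;
		toy_log_add(log, sbs[i].name);
	}
}
```

For an array holding "ext3", a non-freezable "proc" and "loop-on-ext3" in that order, toy_freeze_all() records "loop-on-ext3" before "ext3", and toy_thaw_all() records "ext3" before "loop-on-ext3"; "proc" is never touched because it was never marked.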