Message ID | 20171002225218.18548-3-bart.vanassche@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Oct 02, 2017 at 03:52:12PM -0700, Bart Van Assche wrote: > diff --git a/drivers/md/md.c b/drivers/md/md.c > index 3f7426120a3b..a2cf2a93b0cb 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -8961,6 +8963,37 @@ static void md_stop_all_writes(void) > mdelay(1000*1); > } > > +/* > + * Ensure that neither resyncing nor reshaping occurs while the system is > + * frozen. > + */ > +static int md_notify_pm(struct notifier_block *bl, unsigned long state, > + void *unused) > +{ > + struct mddev *mddev; > + struct list_head *tmp; > + > + pr_debug("%s: state = %ld; system_freezing_cnt = %d\n", __func__, state, > + atomic_read(&system_freezing_cnt)); > + > + switch (state) { > + case PM_HIBERNATION_PREPARE: Hm, why not also include and use this for PM_SUSPEND_PREPARE and/or a PM_RESTORE_PREPARE. case PM_HIBERNATION_PREPARE: case PM_SUSPEND_PREPARE: case PM_RESTORE_PREPARE: > + md_stop_all_writes(); > + break; > + case PM_POST_HIBERNATION: Likewise here: case PM_POST_SUSPEND: case PM_POST_HIBERNATION: case PM_POST_RESTORE: I have revised the kernel suspend ordering and indeed things issued with the pm notifier should suffice to be processed given we actually call the pm ops (dpm_prepare()) for device drivers *after* the notifier is called and then after userspace is frozen. That is: pm_suspend() --> enter_state() --> sys_sync() suspend_prepare() --> __pm_notifier_call_chain(PM_SUSPEND_PREPARE, ...); suspend_freeze_processes() --> freeze_processes() --> __usermodehelper_set_disable_depth(UMH_DISABLED); freeze all tasks ... freeze_kernel_threads() suspend_devices_and_enter() --> dpm_suspend_start() --> dpm_prepare() dpm_suspend() suspend_enter() --> platform_suspend_prepare() dpm_suspend_late() freeze_enter() syscore_suspend() On our way back up: enter_state() --> suspend_devices_and_enter() --> (bail from loop) dpm_resume_end() --> dpm_resume() dpm_complete() suspend_finish() --> suspend_thaw_processes() --> thaw_processes() --> __usermodehelper_set_disable_depth(UMH_FREEZING); thaw_workqueues(); thaw all processes ... usermodehelper_enable(); pm_notifier_call_chain(PM_POST_SUSPEND); So notifier would indeed be the right tooling to use here. Luis
On Mon, Oct 02, 2017 at 03:52:12PM -0700, Bart Van Assche wrote: > Some people use the md driver on laptops and use the suspend and > resume functionality. Since it is essential that submitting of > new I/O requests stops before a hibernation image is created, > interrupt the md resync and reshape actions if the system is > being frozen. Note: the resync and reshape will restart after > the system is resumed and a message similar to the following > will appear in the system log: > > md: md0: data-check interrupted. > > Additionally, only allow resync or reshape activity to be > started after system freezing is over. Last time, I asked which issue is fixed by this patch, but never get answered, so again, what is issue fixed by this patch? Given the following patches for making SCSI quiesce safe is required in either PM suspend/resume or other path of transport_spi revalidate. So once theses patches are in, I don't see any I/O hang in suspend/resume given the SCSI quiesce issue is very specific one, and not see this issue on other kind of device. Thanks, Ming
On Wed, 2017-10-04 at 19:30 +0800, Ming Lei wrote: > Last time, I asked which issue is fixed by this patch, but > never get answered, so again, what is issue fixed by > this patch? That's not correct. I did answer your question. See e.g. https://marc.info/?l=linux-block&m=150695879309606. > Given the following patches for making SCSI quiesce safe > is required in either PM suspend/resume or other path > of transport_spi revalidate. So once theses patches are > in, I don't see any I/O hang in suspend/resume given > the SCSI quiesce issue is very specific one, and not > see this issue on other kind of device. There are many more storage drivers and filesystems than the md driver that can submit I/O after user space processes have been frozen, e.g. the BTRFS RAID and scrub functionalities. It will take time to fix all these drivers and also to add filesystem freeze/thaw functionality to the suspend and resume sequences. As long as not all drivers have been fixed we will need SCSI device quiescing and resuming as a workaround to avoid that any I/O occurs while a hibernation image is being created. Bart.
On Wed, 2017-10-04 at 03:16 +0200, Luis R. Rodriguez wrote: > On Mon, Oct 02, 2017 at 03:52:12PM -0700, Bart Van Assche wrote: > > diff --git a/drivers/md/md.c b/drivers/md/md.c > > index 3f7426120a3b..a2cf2a93b0cb 100644 > > --- a/drivers/md/md.c > > +++ b/drivers/md/md.c > > @@ -8961,6 +8963,37 @@ static void md_stop_all_writes(void) > > mdelay(1000*1); > > } > > > > +/* > > + * Ensure that neither resyncing nor reshaping occurs while the system is > > + * frozen. > > + */ > > +static int md_notify_pm(struct notifier_block *bl, unsigned long state, > > + void *unused) > > +{ > > + struct mddev *mddev; > > + struct list_head *tmp; > > + > > + pr_debug("%s: state = %ld; system_freezing_cnt = %d\n", __func__, state, > > + atomic_read(&system_freezing_cnt)); > > + > > + switch (state) { > > + case PM_HIBERNATION_PREPARE: > > Hm, why not also include and use this for PM_SUSPEND_PREPARE and/or > a PM_RESTORE_PREPARE. > > case PM_HIBERNATION_PREPARE: > case PM_SUSPEND_PREPARE: > case PM_RESTORE_PREPARE: Thanks for the feedback Luis. I will make sure that resync activity is also stopped for the PM_SUSPEND_PREPARE and PM_RESTORE_PREPARE cases. Bart.
diff --git a/drivers/md/md.c b/drivers/md/md.c index 3f7426120a3b..a2cf2a93b0cb 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -66,6 +66,7 @@ #include <linux/raid/md_u.h> #include <linux/slab.h> #include <linux/percpu-refcount.h> +#include <linux/suspend.h> #include <trace/events/block.h> #include "md.h" @@ -7450,7 +7451,8 @@ static int md_thread(void *arg) wait_event_interruptible_timeout (thread->wqueue, - test_bit(THREAD_WAKEUP, &thread->flags) + (test_bit(THREAD_WAKEUP, &thread->flags) && + atomic_read(&system_freezing_cnt) == 0) || kthread_should_stop() || kthread_should_park(), thread->timeout); @@ -8961,6 +8963,37 @@ static void md_stop_all_writes(void) mdelay(1000*1); } +/* + * Ensure that neither resyncing nor reshaping occurs while the system is + * frozen. + */ +static int md_notify_pm(struct notifier_block *bl, unsigned long state, + void *unused) +{ + struct mddev *mddev; + struct list_head *tmp; + + pr_debug("%s: state = %ld; system_freezing_cnt = %d\n", __func__, state, + atomic_read(&system_freezing_cnt)); + + switch (state) { + case PM_HIBERNATION_PREPARE: + md_stop_all_writes(); + break; + case PM_POST_HIBERNATION: + rcu_read_lock(); + for_each_mddev(mddev, tmp) + wake_up(&mddev->thread->wqueue); + rcu_read_unlock(); + break; + } + return NOTIFY_DONE; +} + +static struct notifier_block md_pm_notifier = { + .notifier_call = md_notify_pm, +}; + static int md_notify_reboot(struct notifier_block *this, unsigned long code, void *x) { @@ -9007,6 +9040,7 @@ static int __init md_init(void) md_probe, NULL, NULL); register_reboot_notifier(&md_reboot_notifier); + register_pm_notifier(&md_pm_notifier); raid_table_header = register_sysctl_table(raid_root_table); md_geninit(); @@ -9246,6 +9280,7 @@ static __exit void md_exit(void) unregister_blkdev(MD_MAJOR,"md"); unregister_blkdev(mdp_major, "mdp"); + unregister_pm_notifier(&md_pm_notifier); unregister_reboot_notifier(&md_reboot_notifier); unregister_sysctl_table(raid_table_header);
Some people use the md driver on laptops and use the suspend and resume functionality. Since it is essential that submitting of new I/O requests stops before a hibernation image is created, interrupt the md resync and reshape actions if the system is being frozen. Note: the resync and reshape will restart after the system is resumed and a message similar to the following will appear in the system log: md: md0: data-check interrupted. Additionally, only allow resync or reshape activity to be started after system freezing is over. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Shaohua Li <shli@kernel.org> Cc: linux-raid@vger.kernel.org Cc: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> --- drivers/md/md.c | 37 ++++++++++++++++++++++++++++++++++++- 1 file changed, 36 insertions(+), 1 deletion(-)