| Message ID | 1393486229-72034-9-git-send-email-hare@suse.de (mailing list archive) |
|---|---|
| State | Superseded, archived |
| Delegated to | Mike Snitzer |
On 02/27/14 16:30, Hannes Reinecke wrote:
> activate_path() is run without a lock, so the path might be
> set to failed before activate_path() had a chance to run.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/md/dm-mpath.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 0a40fa9..22e7365 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -1196,8 +1196,11 @@ static void activate_path(struct work_struct *work)
>  	struct pgpath *pgpath =
>  		container_of(work, struct pgpath, activate_path.work);
>
> -	scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
> -			 pg_init_done, pgpath);
> +	if (pgpath->is_active)
> +		scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
> +				 pg_init_done, pgpath);
> +	else
> +		pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED);

Hi Hannes,

What problem are you going to fix with this?
It is still possible that the path is set to failed just after
"if (pgpath->is_active)" and before "scsi_dh_activate".
On 02/28/2014 09:49 AM, Junichi Nomura wrote:
> On 02/27/14 16:30, Hannes Reinecke wrote:
>> activate_path() is run without a lock, so the path might be
>> set to failed before activate_path() had a chance to run.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>  drivers/md/dm-mpath.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
>> index 0a40fa9..22e7365 100644
>> --- a/drivers/md/dm-mpath.c
>> +++ b/drivers/md/dm-mpath.c
>> @@ -1196,8 +1196,11 @@ static void activate_path(struct work_struct *work)
>>  	struct pgpath *pgpath =
>>  		container_of(work, struct pgpath, activate_path.work);
>>
>> -	scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>> -			 pg_init_done, pgpath);
>> +	if (pgpath->is_active)
>> +		scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>> +				 pg_init_done, pgpath);
>> +	else
>> +		pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED);
>
> Hi Hannes,
>
> What problem are you going to fix with this?
> It is still possible that the path is set to failed just after
> "if (pgpath->is_active)" and before "scsi_dh_activate".
>
activate_path() is run from a workqueue, so there is a race window
between the check of ->is_active in __pg_init_all_paths() and the time
scsi_dh_activate() is executed. And as activate_path() is run without
any locks we don't have any synchronization with fail_path().
So it's quite possible that a path gets failed in between
__pg_init_all_paths() and activate_path().

Granted, we could remove the check for ->is_active in
__pg_init_all_paths(), but I wanted to avoid scheduling a workqueue
item which is known to fail.

Cheers,

Hannes
On 02/28/14 18:32, Hannes Reinecke wrote:
> On 02/28/2014 09:49 AM, Junichi Nomura wrote:
>> On 02/27/14 16:30, Hannes Reinecke wrote:
>>> activate_path() is run without a lock, so the path might be
>>> set to failed before activate_path() had a chance to run.
>>>
>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>> ---
>>>  drivers/md/dm-mpath.c | 7 +++++--
>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
>>> index 0a40fa9..22e7365 100644
>>> --- a/drivers/md/dm-mpath.c
>>> +++ b/drivers/md/dm-mpath.c
>>> @@ -1196,8 +1196,11 @@ static void activate_path(struct work_struct *work)
>>>  	struct pgpath *pgpath =
>>>  		container_of(work, struct pgpath, activate_path.work);
>>>
>>> -	scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>>> -			 pg_init_done, pgpath);
>>> +	if (pgpath->is_active)
>>> +		scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>>> +				 pg_init_done, pgpath);
>>> +	else
>>> +		pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED);
>>
>> Hi Hannes,
>>
>> What problem are you going to fix with this?
>> It is still possible that the path is set to failed just after
>> "if (pgpath->is_active)" and before "scsi_dh_activate".
>>
> activate_path() is run from a workqueue, so there is a race window
> between the check of ->is_active in __pg_init_all_paths() and the time
> scsi_dh_activate() is executed. And as activate_path() is run without
> any locks we don't have any synchronization with fail_path().
> So it's quite possible that a path gets failed in between
> __pg_init_all_paths() and activate_path().

What I pointed out is that your patch makes the window very small
but doesn't close it.
If the race is a problem, this patch doesn't fix the problem.
On 02/28/2014 10:37 AM, Junichi Nomura wrote:
> On 02/28/14 18:32, Hannes Reinecke wrote:
>> On 02/28/2014 09:49 AM, Junichi Nomura wrote:
>>> On 02/27/14 16:30, Hannes Reinecke wrote:
>>>> activate_path() is run without a lock, so the path might be
>>>> set to failed before activate_path() had a chance to run.
>>>>
>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>> ---
>>>>  drivers/md/dm-mpath.c | 7 +++++--
>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
>>>> index 0a40fa9..22e7365 100644
>>>> --- a/drivers/md/dm-mpath.c
>>>> +++ b/drivers/md/dm-mpath.c
>>>> @@ -1196,8 +1196,11 @@ static void activate_path(struct work_struct *work)
>>>> 	struct pgpath *pgpath =
>>>> 		container_of(work, struct pgpath, activate_path.work);
>>>>
>>>> -	scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>>>> -			 pg_init_done, pgpath);
>>>> +	if (pgpath->is_active)
>>>> +		scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>>>> +				 pg_init_done, pgpath);
>>>> +	else
>>>> +		pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED);
>>>
>>> Hi Hannes,
>>>
>>> What problem are you going to fix with this?
>>> It is still possible that the path is set to failed just after
>>> "if (pgpath->is_active)" and before "scsi_dh_activate".
>>>
>> activate_path() is run from a workqueue, so there is a race window
>> between the check of ->is_active in __pg_init_all_paths() and the time
>> scsi_dh_activate() is executed. And as activate_path() is run without
>> any locks we don't have any synchronization with fail_path().
>> So it's quite possible that a path gets failed in between
>> __pg_init_all_paths() and activate_path().
>
> What I pointed out is that your patch makes the window very small
> but doesn't close it.
> If the race is a problem, this patch doesn't fix the problem.
>
True. As said, I just want to avoid unnecessary overhead by calling
functions which are known to fail.

This patch actually makes more sense in combination with the 'accept
failed paths' patch, where it's much more likely to trigger.
So if you insist I could move it over there.

Cheers,

Hannes
On 02/28/14 18:52, Hannes Reinecke wrote:
> On 02/28/2014 10:37 AM, Junichi Nomura wrote:
>> On 02/28/14 18:32, Hannes Reinecke wrote:
>>> On 02/28/2014 09:49 AM, Junichi Nomura wrote:
>>>> On 02/27/14 16:30, Hannes Reinecke wrote:
>>>>> activate_path() is run without a lock, so the path might be
>>>>> set to failed before activate_path() had a chance to run.
>>>>>
>>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>>> ---
>>>>>  drivers/md/dm-mpath.c | 7 +++++--
>>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
>>>>> index 0a40fa9..22e7365 100644
>>>>> --- a/drivers/md/dm-mpath.c
>>>>> +++ b/drivers/md/dm-mpath.c
>>>>> @@ -1196,8 +1196,11 @@ static void activate_path(struct work_struct *work)
>>>>> 	struct pgpath *pgpath =
>>>>> 		container_of(work, struct pgpath, activate_path.work);
>>>>>
>>>>> -	scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>>>>> -			 pg_init_done, pgpath);
>>>>> +	if (pgpath->is_active)
>>>>> +		scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>>>>> +				 pg_init_done, pgpath);
>>>>> +	else
>>>>> +		pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED);
>>>>
>>>> Hi Hannes,
>>>>
>>>> What problem are you going to fix with this?
>>>> It is still possible that the path is set to failed just after
>>>> "if (pgpath->is_active)" and before "scsi_dh_activate".
>>>>
>>> activate_path() is run from a workqueue, so there is a race window
>>> between the check of ->is_active in __pg_init_all_paths() and the time
>>> scsi_dh_activate() is executed. And as activate_path() is run without
>>> any locks we don't have any synchronization with fail_path().
>>> So it's quite possible that a path gets failed in between
>>> __pg_init_all_paths() and activate_path().
>>
>> What I pointed out is that your patch makes the window very small
>> but doesn't close it.
>> If the race is a problem, this patch doesn't fix the problem.
>>
> True. As said, I just want to avoid unnecessary overhead by calling
> functions which are known to fail.

"to avoid unnecessary overhead by calling functions which are known
to fail"

IMO, that should be in the patch description. :)

> This patch actually makes more sense in combination with the 'accept
> failed paths' patch, where it's much more likely to trigger.
> So if you insist I could move it over there.

If the description is fixed, I have no problem with this patch.
On Fri, Feb 28 2014 at 4:52am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 02/28/2014 10:37 AM, Junichi Nomura wrote:
> > On 02/28/14 18:32, Hannes Reinecke wrote:
> >> On 02/28/2014 09:49 AM, Junichi Nomura wrote:
> >>> On 02/27/14 16:30, Hannes Reinecke wrote:
> >>>> activate_path() is run without a lock, so the path might be
> >>>> set to failed before activate_path() had a chance to run.
> >>>>
> >>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
> >>>> ---
> >>>>  drivers/md/dm-mpath.c | 7 +++++--
> >>>>  1 file changed, 5 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> >>>> index 0a40fa9..22e7365 100644
> >>>> --- a/drivers/md/dm-mpath.c
> >>>> +++ b/drivers/md/dm-mpath.c
> >>>> @@ -1196,8 +1196,11 @@ static void activate_path(struct work_struct *work)
> >>>> 	struct pgpath *pgpath =
> >>>> 		container_of(work, struct pgpath, activate_path.work);
> >>>>
> >>>> -	scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
> >>>> -			 pg_init_done, pgpath);
> >>>> +	if (pgpath->is_active)
> >>>> +		scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
> >>>> +				 pg_init_done, pgpath);
> >>>> +	else
> >>>> +		pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED);
> >>>
> >>> Hi Hannes,
> >>>
> >>> What problem are you going to fix with this?
> >>> It is still possible that the path is set to failed just after
> >>> "if (pgpath->is_active)" and before "scsi_dh_activate".
> >>>
> >> activate_path() is run from a workqueue, so there is a race window
> >> between the check of ->is_active in __pg_init_all_paths() and the time
> >> scsi_dh_activate() is executed. And as activate_path() is run without
> >> any locks we don't have any synchronization with fail_path().
> >> So it's quite possible that a path gets failed in between
> >> __pg_init_all_paths() and activate_path().
> >
> > What I pointed out is that your patch makes the window very small
> > but doesn't close it.
> > If the race is a problem, this patch doesn't fix the problem.
> >
> True. As said, I just want to avoid unnecessary overhead by calling
> functions which are known to fail.
> This patch actually makes more sense in combination with the 'accept
> failed paths' patch, where it's much more likely to trigger.

FYI, I still intend to review/take your "accept failed paths" patch.
If any rebase is needed for that patch, it would be helpful if you did
so and re-posted.

One thing I noticed is you're only converting MAJ:MIN paths to devices.
I think we should factor a helper out of DM core that does the full path
lookup check from dm_get_device() -- rather than having you open-code an
older version of the MAJ:MIN device path processing.

But is there a reason for not using lookup_bdev()? Because the device
is failed it cannot be found using lookup_bdev()?

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On 02/28/2014 03:22 PM, Mike Snitzer wrote:
[ .. ]
>
> FYI, I still intend to review/take your "accept failed paths" patch.
> If any rebase is needed for that patch, it would be helpful if you did
> so and re-posted.
>
> One thing I noticed is you're only converting MAJ:MIN paths to devices.
> I think we should factor a helper out of DM core that does the full path
> lookup check from dm_get_device() -- rather than having you open-code an
> older version of the MAJ:MIN device path processing.
>
> But is there a reason for not using lookup_bdev()? Because the device
> is failed it cannot be found using lookup_bdev()?
>
Yes, that's precisely it.
When setting dev_loss_tmo very aggressively (to, say, 10 or even 5
seconds) it might well be that multipathd won't be able to keep up
with the events in time.
So by the time multipathd tries to push a new table into the kernel
(which contains major:minor numbers only) those devices are already
gone. So lookup_bdev() won't be able to find them, either.

Not sure if factoring things out from dm_get_device() would help
here ...

But I'll be sending an updated patch, which'll apply on top of the
'push back' patchset.

Cheers,

Hannes
On Fri, Feb 28 2014 at 9:44am -0500,
Hannes Reinecke <hare@suse.de> wrote:

> On 02/28/2014 03:22 PM, Mike Snitzer wrote:
> [ .. ]
> >
> > FYI, I still intend to review/take your "accept failed paths" patch.
> > If any rebase is needed for that patch, it would be helpful if you did
> > so and re-posted.
> >
> > One thing I noticed is you're only converting MAJ:MIN paths to devices.
> > I think we should factor a helper out of DM core that does the full path
> > lookup check from dm_get_device() -- rather than having you open-code an
> > older version of the MAJ:MIN device path processing.
> >
> > But is there a reason for not using lookup_bdev()? Because the device
> > is failed it cannot be found using lookup_bdev()?
> >
> Yes, that's precisely it.
> When setting dev_loss_tmo very aggressively (to, say, 10 or even 5
> seconds) it might well be that multipathd won't be able to keep up
> with the events in time.
> So by the time multipathd tries to push a new table into the kernel
> (which contains major:minor numbers only) those devices are already
> gone. So lookup_bdev() won't be able to find them, either.

I've been talking this over with Alasdair; I need some things
clarified.

Do you need to handle non-existent devices during the initial mpath
table load?

Or is the failed path in question already part of the active mpath
table -- so the active table currently holds a reference to the device?

Or do you need to support both cases? Sounds like you want to support
both cases.

> Not sure if factoring things out from dm_get_device() would help
> here ...
>
> But I'll be sending an updated patch, which'll apply on top of the
> 'push back' patchset.

Thanks.
On 02/28/2014 04:02 PM, Mike Snitzer wrote:
> On Fri, Feb 28 2014 at 9:44am -0500,
> Hannes Reinecke <hare@suse.de> wrote:
>
>> On 02/28/2014 03:22 PM, Mike Snitzer wrote:
>> [ .. ]
>>>
>>> FYI, I still intend to review/take your "accept failed paths" patch.
>>> If any rebase is needed for that patch, it would be helpful if you did
>>> so and re-posted.
>>>
>>> One thing I noticed is you're only converting MAJ:MIN paths to devices.
>>> I think we should factor a helper out of DM core that does the full path
>>> lookup check from dm_get_device() -- rather than having you open-code an
>>> older version of the MAJ:MIN device path processing.
>>>
>>> But is there a reason for not using lookup_bdev()? Because the device
>>> is failed it cannot be found using lookup_bdev()?
>>>
>> Yes, that's precisely it.
>> When setting dev_loss_tmo very aggressively (to, say, 10 or even 5
>> seconds) it might well be that multipathd won't be able to keep up
>> with the events in time.
>> So by the time multipathd tries to push a new table into the kernel
>> (which contains major:minor numbers only) those devices are already
>> gone. So lookup_bdev() won't be able to find them, either.
>
> I've been talking this over with Alasdair; I need some things
> clarified.
>
> Do you need to handle non-existent devices during the initial mpath
> table load?
>
> Or is the failed path in question already part of the active mpath
> table -- so the active table currently holds a reference to the device?
>
> Or do you need to support both cases? Sounds like you want to support
> both cases.
>
Hmm. I wouldn't think I'd need it during the initial mpath table load;
but then you never know, devices might fail faster than you'd think ...

So yeah, I'd need to support both cases.

The main reason why I've sent this patch is to get rid of those
_really_ annoying 'error getting device' messages during failover.
We really should and can do better than that.

That this patch allows us to drop references to failed devices earlier
is a nice side effect :-)

Cheers,

Hannes
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 0a40fa9..22e7365 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1196,8 +1196,11 @@ static void activate_path(struct work_struct *work)
 	struct pgpath *pgpath =
 		container_of(work, struct pgpath, activate_path.work);
 
-	scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
-			 pg_init_done, pgpath);
+	if (pgpath->is_active)
+		scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
+				 pg_init_done, pgpath);
+	else
+		pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED);
 }
 
 static int noretry_error(int error)
activate_path() is run without a lock, so the path might be
set to failed before activate_path() had a chance to run.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/md/dm-mpath.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)