Message ID: 20170512131508.3231-4-mwilck@suse.com (mailing list archive)
State: Changes Requested, archived
On Fri, 2017-05-12 at 15:15 +0200, Martin Wilck wrote:
> alua_rtpg() can race with alua_bus_detach(). The assertion that
> alua_dh_data *h->sdev must be non-NULL is not guaranteed because
> alua_bus_detach sets this field to NULL before removing the entry
> from the port group's dh_list.
>
> This happens when a device is about to be removed, so don't BUG out
> but continue silently.
>
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> ---
>  drivers/scsi/device_handler/scsi_dh_alua.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 2b60f493f90e..a59783020c66 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -652,9 +652,15 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>  				rcu_read_lock();
>  				list_for_each_entry_rcu(h,
>  							&tmp_pg->dh_list, node) {
> -					/* h->sdev should always be valid */
> -					BUG_ON(!h->sdev);
> -					h->sdev->access_state = desc[0];
> +					/*
> +					 * We might be racing with
> +					 * alua_bus_detach here
> +					 */
> +					struct scsi_device *sdev =
> +						h->sdev;
> +					if (sdev)
> +						sdev->access_state =
> +							desc[0];
>  				}
>  				rcu_read_unlock();
>  			}
> @@ -694,11 +700,13 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>  		pg->expiry = 0;
>  		rcu_read_lock();
>  		list_for_each_entry_rcu(h, &pg->dh_list, node) {
> -			BUG_ON(!h->sdev);
> -			h->sdev->access_state =
> +			struct scsi_device *sdev = h->sdev;
> +			if (!sdev)
> +				continue;
> +			sdev->access_state =
>  				(pg->state & SCSI_ACCESS_STATE_MASK);
>  			if (pg->pref)
> -				h->sdev->access_state |=
> +				sdev->access_state |=
>  					SCSI_ACCESS_STATE_PREFERRED;
>  		}
>  		rcu_read_unlock();

Hello Martin,

Allowing races like the one this patch tries to address to exist makes
the ALUA code harder to maintain than necessary. Have you considered to
make alua_bus_detach() wait until ALUA work has finished by using e.g.
cancel_work_sync() or rcu_synchronize()?

Bart.
On Fri, 2017-05-12 at 16:24 +0000, Bart Van Assche wrote:
> Hello Martin,
>
> Allowing races like the one this patch tries to address to exist
> makes the ALUA code harder to maintain than necessary. Have you
> considered to make alua_bus_detach() wait until ALUA work has
> finished by using e.g. cancel_work_sync() or rcu_synchronize()?
>
> Bart.

Hello Bart,

to be honest, no, I didn't consider this yet. The current kernel
crashes with BUG() if an ALUA device is detached at an inopportune
point in time (not just theoretically, we actually observed this). The
goal of my patch was to fix this with minimum risk of introducing other
problems. The addition in patch 4/4 was an attempt to address the
concern you had expressed in your review of the v1 patch.

I'm not opposed to trying to find a better solution, but could we maybe
get the fix for the BUG() (i.e. patch 3/4) applied in the first place?
AFAICS it would not conflict with a solution like the one you
suggested.

Best regards and thanks for the review,
Martin
On Mon, 2017-05-15 at 10:16 +0200, Martin Wilck wrote:
> On Fri, 2017-05-12 at 16:24 +0000, Bart Van Assche wrote:
> > Allowing races like the one this patch tries to address to exist
> > makes the ALUA code harder to maintain than necessary. Have you
> > considered to make alua_bus_detach() wait until ALUA work has
> > finished by using e.g. cancel_work_sync() or rcu_synchronize()?
>
> to be honest, no, I didn't consider this yet. The current kernel
> crashes with BUG() if an ALUA device is detached at an inopportune
> point in time (not just theoretically, we actually observed this).
> The goal of my patch was to fix this with minimum risk of
> introducing other problems. The addition in patch 4/4 was an attempt
> to address the concern you had expressed in your review of the v1
> patch.
>
> I'm not opposed to trying to find a better solution, but could we
> maybe get the fix for the BUG() (i.e. patch 3/4) applied in the
> first place? AFAICS it would not conflict with a solution like the
> one you suggested.

Hello Martin,

Sorry, but I don't think it's a good idea to merge patch 3/4 into the
upstream kernel. Even with that patch applied there is nothing that
prevents h->handler_data from being freed while alua_rtpg() is in
progress, and hence h->sdev is a completely random pointer if
alua_rtpg() is executed concurrently with alua_bus_detach(). Please do
not try to paper over race conditions but fix these properly.

Bart.
On Mon, 2017-05-15 at 16:03 +0000, Bart Van Assche wrote:
> On Mon, 2017-05-15 at 10:16 +0200, Martin Wilck wrote:
> > On Fri, 2017-05-12 at 16:24 +0000, Bart Van Assche wrote:
> > > Allowing races like the one this patch tries to address to exist
> > > makes the ALUA code harder to maintain than necessary. Have you
> > > considered to make alua_bus_detach() wait until ALUA work has
> > > finished by using e.g. cancel_work_sync() or rcu_synchronize()?
> >
> > to be honest, no, I didn't consider this yet. The current kernel
> > crashes with BUG() if an ALUA device is detached at an inopportune
> > point in time (not just theoretically, we actually observed this).
> > The goal of my patch was to fix this with minimum risk of
> > introducing other problems. The addition in patch 4/4 was an
> > attempt to address the concern you had expressed in your review of
> > the v1 patch.
> >
> > I'm not opposed to trying to find a better solution, but could we
> > maybe get the fix for the BUG() (i.e. patch 3/4) applied in the
> > first place? AFAICS it would not conflict with a solution like the
> > one you suggested.
>
> Hello Martin,
>
> Sorry, but I don't think it's a good idea to merge patch 3/4 into
> the upstream kernel. Even with that patch applied there is nothing
> that prevents h->handler_data from being freed while alua_rtpg() is
> in progress, and hence h->sdev is a completely random pointer if
> alua_rtpg() is executed concurrently with alua_bus_detach(). Please
> do not try to paper over race conditions but fix these properly.

Hello Bart,

please be assured that I'm not trying to paper over anything. Your
concern about sdev->handler_data is justified. While I think that it's
a separate issue from what my patches were supposed to address, let me
see if I can come up with something more comprehensive.

It will take time, though, until I fully comprehend the locking concept
of scsi_dh_alua.c.

Regards,
Martin
On Mon, May 15, 2017 at 08:30:19PM +0200, Martin Wilck wrote:
> please be assured that I'm not trying to paper over anything. Your
> concern about sdev->handler_data is justified. While I think that
> it's a separate issue from what my patches were supposed to address,
> let me see if I can come up with something more comprehensive.
>
> It will take time, though, until I fully comprehend the locking
> concept of scsi_dh_alua.c.

I think a simple call_rcu for the cleanup in alua_bus_detach should do
the job. And I agree with Bart that we need to sort this out properly.
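[Editor's note: for reference, a rough sketch of what Christoph's call_rcu() suggestion could look like. This is not the actual patch that was merged — the rcu_head member, the callback name, and the placement are assumptions, and the dh_list unlinking and locking are elided.]

```c
/* Hypothetical sketch only: assumes struct alua_dh_data gains a
 * struct rcu_head member named "rcu". */
static void alua_free_h_rcu(struct rcu_head *head)
{
	struct alua_dh_data *h = container_of(head, struct alua_dh_data, rcu);

	kfree(h);
}

static void alua_bus_detach(struct scsi_device *sdev)
{
	struct alua_dh_data *h = sdev->handler_data;

	/* ... unlink h from the port group's dh_list under the
	 * appropriate lock, clear h->sdev ... */
	sdev->handler_data = NULL;
	/*
	 * alua_rtpg() walks dh_list under rcu_read_lock(); deferring
	 * the kfree() until after a grace period guarantees that a
	 * concurrent reader can never see freed memory, which closes
	 * the h->handler_data lifetime hole Bart pointed out.
	 */
	call_rcu(&h->rcu, alua_free_h_rcu);
}
```

Compared with patch 3/4 alone, this makes the NULL check in alua_rtpg() sufficient: the pointer may be stale-NULL, but the memory behind a non-NULL h is guaranteed to still be valid for the duration of the RCU read-side critical section.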
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 2b60f493f90e..a59783020c66 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -652,9 +652,15 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 				rcu_read_lock();
 				list_for_each_entry_rcu(h,
 							&tmp_pg->dh_list, node) {
-					/* h->sdev should always be valid */
-					BUG_ON(!h->sdev);
-					h->sdev->access_state = desc[0];
+					/*
+					 * We might be racing with
+					 * alua_bus_detach here
+					 */
+					struct scsi_device *sdev =
+						h->sdev;
+					if (sdev)
+						sdev->access_state =
+							desc[0];
 				}
 				rcu_read_unlock();
 			}
@@ -694,11 +700,13 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 		pg->expiry = 0;
 		rcu_read_lock();
 		list_for_each_entry_rcu(h, &pg->dh_list, node) {
-			BUG_ON(!h->sdev);
-			h->sdev->access_state =
+			struct scsi_device *sdev = h->sdev;
+			if (!sdev)
+				continue;
+			sdev->access_state =
 				(pg->state & SCSI_ACCESS_STATE_MASK);
 			if (pg->pref)
-				h->sdev->access_state |=
+				sdev->access_state |=
 					SCSI_ACCESS_STATE_PREFERRED;
 		}
 		rcu_read_unlock();