Message ID | 20240801082257.506006-1-luciano.coelho@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [core-for-CI] scsi: sd: Move sd_read_cpr() out of the q->limits_lock region |
There is also this issue filed by Luca: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11813

@Nikula, Jani, ok to merge. Already tested at trybot: https://patchwork.freedesktop.org/series/136776/

> -----Original Message-----
> From: Coelho, Luciano <luciano.coelho@intel.com>
> Sent: Thursday, 1 August 2024 11.23
> To: intel-gfx@lists.freedesktop.org
> Cc: Saarinen, Jani <jani.saarinen@intel.com>
> Subject: [core-for-CI PATCH] scsi: sd: Move sd_read_cpr() out of the q->limits_lock region
>
> From: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
>
> Commit 804e498e0496 ("sd: convert to the atomic queue limits API")
> introduced pairs of function calls to queue_limits_start_update() and
> queue_limits_commit_update(). These two functions lock and unlock
> q->limits_lock. In sd_revalidate_disk(), sd_read_cpr() is called after
> queue_limits_start_update() call and before
> queue_limits_commit_update() call. sd_read_cpr() locks q->sysfs_dir_lock
> and &q->sysfs_lock. Then new lock dependencies were created between
> q->limits_lock, q->sysfs_dir_lock and q->sysfs_lock, as follows:
>
> sd_revalidate_disk
>   queue_limits_start_update
>     mutex_lock(&q->limits_lock)
>   sd_read_cpr
>     disk_set_independent_access_ranges
>       mutex_lock(&q->sysfs_dir_lock)
>       mutex_lock(&q->sysfs_lock)
>       mutex_unlock(&q->sysfs_lock)
>       mutex_unlock(&q->sysfs_dir_lock)
>   queue_limits_commit_update
>     mutex_unlock(&q->limits_lock)
>
> However, the three locks already had reversed dependencies in other places.
> Then the new dependencies triggered the lockdep WARN "possible circular
> locking dependency detected" [1]. This WARN was observed by running the
> blktests test case srp/002.
>
> To avoid the WARN, move the sd_read_cpr() call in sd_revalidate_disk() after
> the queue_limits_commit_update() call. In other words, move the
> sd_read_cpr() call out of the q->limits_lock region.
>
> [1] https://lore.kernel.org/linux-scsi/vlmv53ni3ltwxplig5qnw4xsl2h6ccxijfbqzekx76vxoim5a5@dekv7q3es3tx/
>
> Fixes: 804e498e0496 ("sd: convert to the atomic queue limits API")
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> ---
>  drivers/scsi/sd.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index adeaa8ab9951..08cbe3815006 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3753,7 +3753,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
>  			sd_read_block_limits_ext(sdkp);
>  			sd_read_block_characteristics(sdkp, &lim);
>  			sd_zbc_read_zones(sdkp, &lim, buffer);
> -			sd_read_cpr(sdkp);
>  		}
>
>  		sd_print_capacity(sdkp, old_capacity);
> @@ -3808,6 +3807,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
>  	if (err)
>  		return err;
>
> +	/*
> +	 * Query concurrent positioning ranges after
> +	 * queue_limits_commit_update() unlocked q->limits_lock to avoid
> +	 * deadlock with q->sysfs_dir_lock and q->sysfs_lock.
> +	 */
> +	if (sdkp->media_present && scsi_device_supports_vpd(sdp))
> +		sd_read_cpr(sdkp);
> +
>  	/*
>  	 * For a zoned drive, revalidating the zones can be done only once
>  	 * the gendisk capacity is set. So if this fails, set back the gendisk
> --
> 2.39.2
On Thu, 01 Aug 2024, "Saarinen, Jani" <jani.saarinen@intel.com> wrote:
> There is also this issue filed by Luca: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11813
>
> @Nikula, Jani, ok to merge. Already tested at trybot: https://patchwork.freedesktop.org/series/136776/

Acked-by: Jani Nikula <jani.nikula@intel.com>

The full IGT results aren't in for the trybot submission though.
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index adeaa8ab9951..08cbe3815006 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3753,7 +3753,6 @@ static int sd_revalidate_disk(struct gendisk *disk)
 			sd_read_block_limits_ext(sdkp);
 			sd_read_block_characteristics(sdkp, &lim);
 			sd_zbc_read_zones(sdkp, &lim, buffer);
-			sd_read_cpr(sdkp);
 		}
 
 		sd_print_capacity(sdkp, old_capacity);
@@ -3808,6 +3807,14 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	if (err)
 		return err;
 
+	/*
+	 * Query concurrent positioning ranges after
+	 * queue_limits_commit_update() unlocked q->limits_lock to avoid
+	 * deadlock with q->sysfs_dir_lock and q->sysfs_lock.
+	 */
+	if (sdkp->media_present && scsi_device_supports_vpd(sdp))
+		sd_read_cpr(sdkp);
+
 	/*
 	 * For a zoned drive, revalidating the zones can be done only once
 	 * the gendisk capacity is set. So if this fails, set back the gendisk
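
As a rough user-space illustration of the reordering in the patch above, the sketch below uses pthread mutexes as stand-ins for q->limits_lock, q->sysfs_dir_lock and q->sysfs_lock. Every name prefixed with fake_, the demo_queue object and the holding_limits_lock flag are hypothetical and are not kernel APIs. It only shows the idea of taking the sysfs locks after the limits lock has been dropped, under those assumptions; it is not how sd.c is implemented.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the three request_queue mutexes. */
struct fake_queue {
	pthread_mutex_t limits_lock;
	pthread_mutex_t sysfs_dir_lock;
	pthread_mutex_t sysfs_lock;
	int ranges_updates;	/* completed access-range updates */
};

static struct fake_queue demo_queue = {
	.limits_lock = PTHREAD_MUTEX_INITIALIZER,
	.sysfs_dir_lock = PTHREAD_MUTEX_INITIALIZER,
	.sysfs_lock = PTHREAD_MUTEX_INITIALIZER,
	.ranges_updates = 0,
};

/* Per-thread debugging flag: true while this thread holds limits_lock. */
static _Thread_local bool holding_limits_lock;

/* Mirrors the role of queue_limits_start_update(): take limits_lock. */
static void fake_limits_start_update(struct fake_queue *q)
{
	pthread_mutex_lock(&q->limits_lock);
	holding_limits_lock = true;
}

/* Mirrors the role of queue_limits_commit_update(): drop limits_lock. */
static void fake_limits_commit_update(struct fake_queue *q)
{
	holding_limits_lock = false;
	pthread_mutex_unlock(&q->limits_lock);
}

/*
 * Mirrors the locking done under sd_read_cpr(): sysfs_dir_lock, then
 * sysfs_lock. Returns -1 instead of locking if the caller still holds
 * limits_lock, which is the nesting the patch removes.
 */
static int fake_set_access_ranges(struct fake_queue *q)
{
	if (holding_limits_lock)
		return -1;

	pthread_mutex_lock(&q->sysfs_dir_lock);
	pthread_mutex_lock(&q->sysfs_lock);
	q->ranges_updates++;
	pthread_mutex_unlock(&q->sysfs_lock);
	pthread_mutex_unlock(&q->sysfs_dir_lock);
	return 0;
}

/* Ordering after the patch: commit the limits first, then touch sysfs. */
static int fake_revalidate(struct fake_queue *q)
{
	fake_limits_start_update(q);
	/* ... limits would be filled in here ... */
	fake_limits_commit_update(q);

	return fake_set_access_ranges(q);
}
```

In this sketch, fake_revalidate(&demo_queue) returns 0 because the access-range update runs only after the limits lock is released, while calling fake_set_access_ranges() between the start and commit calls returns -1.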