scsi: pm80xx: Remove msleep() loop from pm8001_dev_gone_notify()

Message ID 20240709160013.634308-1-tadamsjr@google.com (mailing list archive)
State Changes Requested
Series scsi: pm80xx: Remove msleep() loop from pm8001_dev_gone_notify()

Commit Message

TJ Adams July 9, 2024, 4 p.m. UTC
From: Igor Pylypiv <ipylypiv@google.com>

It's possible to end up in a state where pm8001_dev->running_req never
reaches zero. In that state we will be sleeping forever.

sas_execute_internal_abort_dev() can wait for a response for
up to 60 seconds (3 retries x 20 seconds). 60 seconds should be enough
for pm8001_dev->running_req to get to zero.

Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: TJ Adams <tadamsjr@google.com>
---
 drivers/scsi/pm8001/pm8001_sas.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

John Garry July 9, 2024, 4:09 p.m. UTC | #1
On 09/07/2024 17:00, TJ Adams wrote:
> From: Igor Pylypiv <ipylypiv@google.com>
> 
> It's possible to end up in a state where pm8001_dev->running_req never
> reaches zero.

Is that a driver bug then?

> In that state we will be sleeping forever.
> 
> sas_execute_internal_abort_dev() can wait for a response for
> up to 60 seconds (3 retries x 20 seconds). 60 seconds should be enough
> for pm8001_dev->running_req to get to zero.

May I suggest you drop running_req at some stage, and use other methods 
to find how many IOs are active?

> 
> Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
> Signed-off-by: TJ Adams <tadamsjr@google.com>
> ---
>   drivers/scsi/pm8001/pm8001_sas.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
> index a5a31dfa4512..513e9a49838c 100644
> --- a/drivers/scsi/pm8001/pm8001_sas.c
> +++ b/drivers/scsi/pm8001/pm8001_sas.c
> @@ -712,8 +712,11 @@ static void pm8001_dev_gone_notify(struct domain_device *dev)
>   		if (atomic_read(&pm8001_dev->running_req)) {
>   			spin_unlock_irqrestore(&pm8001_ha->lock, flags);
>   			sas_execute_internal_abort_dev(dev, 0, NULL);
> -			while (atomic_read(&pm8001_dev->running_req))
> -				msleep(20);
> +			if (atomic_read(&pm8001_dev->running_req)) {
> +				pm8001_dbg(pm8001_ha, FAIL,
> +					   "device_id: %u: Failed to abort %d requests!\n",
> +					   device_id, atomic_read(&pm8001_dev->running_req));
> +			}
>   			spin_lock_irqsave(&pm8001_ha->lock, flags);
>   		}
>   		PM8001_CHIP_DISP->dereg_dev_req(pm8001_ha, device_id);

Patch

diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index a5a31dfa4512..513e9a49838c 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -712,8 +712,11 @@  static void pm8001_dev_gone_notify(struct domain_device *dev)
 		if (atomic_read(&pm8001_dev->running_req)) {
 			spin_unlock_irqrestore(&pm8001_ha->lock, flags);
 			sas_execute_internal_abort_dev(dev, 0, NULL);
-			while (atomic_read(&pm8001_dev->running_req))
-				msleep(20);
+			if (atomic_read(&pm8001_dev->running_req)) {
+				pm8001_dbg(pm8001_ha, FAIL,
+					   "device_id: %u: Failed to abort %d requests!\n",
+					   device_id, atomic_read(&pm8001_dev->running_req));
+			}
 			spin_lock_irqsave(&pm8001_ha->lock, flags);
 		}
 		PM8001_CHIP_DISP->dereg_dev_req(pm8001_ha, device_id);
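
The control-flow change in the hunk above can be modeled in userspace. The following is only a sketch under stated assumptions, not driver code: struct model_dev, model_abort_dev() and model_dev_gone_notify() are hypothetical names, the abort helper simply pretends that a given number of requests completed, and locking is omitted. It illustrates that the patched path checks the counter once after the abort and reports leftovers instead of sleeping until the counter reaches zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's device struct; only the
 * outstanding-request counter and the device id are modeled. */
struct model_dev {
	atomic_int running_req;
	unsigned int device_id;
};

/* Hypothetical stand-in for sas_execute_internal_abort_dev(): pretends
 * that `completed` of the outstanding requests finished, then returns. */
static void model_abort_dev(struct model_dev *dev, int completed)
{
	int cur = atomic_load(&dev->running_req);

	if (completed > cur)
		completed = cur;
	atomic_fetch_sub(&dev->running_req, completed);
}

/* Patched flow: abort once, check once, log what is left, never loop.
 * Returns the number of requests still outstanding. */
static int model_dev_gone_notify(struct model_dev *dev, int completed)
{
	int left;

	assert(dev != NULL);

	if (!atomic_load(&dev->running_req))
		return 0;

	model_abort_dev(dev, completed);

	/* The old code was: while (running_req) msleep(20);
	 * which never returns if the counter is stuck above zero. */
	left = atomic_load(&dev->running_req);
	if (left)
		fprintf(stderr,
			"device_id: %u: Failed to abort %d requests!\n",
			dev->device_id, left);
	return left;
}
```

With three outstanding requests and an abort that completes only one, model_dev_gone_notify() returns 2 and logs the failure rather than blocking. The sketch reads the counter into a local once so that the logged value is the tested value; the patch calls atomic_read() twice, so the two values could differ if a request completes in between.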