Message ID | 20230612120733.3079507-2-ogabbay@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/3] accel/habanalabs: remove pdev check on idle check | expand |
On 12/06/2023 15:07, Oded Gabbay wrote: If scrubbing memory after user released device has failed it means the device is in a bad state and should be reset. Signed-off-by: Oded Gabbay <ogabbay@kernel.org><mailto:ogabbay@kernel.org> --- drivers/accel/habanalabs/common/device.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c index 5e61761b8c11..d7d9198b2103 100644 --- a/drivers/accel/habanalabs/common/device.c +++ b/drivers/accel/habanalabs/common/device.c @@ -454,8 +454,10 @@ static void hpriv_release(struct kref *ref) /* Scrubbing is handled within hl_device_reset(), so here need to do it directly */ int rc = hdev->asic_funcs->scrub_device_mem(hdev); - if (rc) + if (rc) { dev_err(hdev->dev, "failed to scrub memory from hpriv release (%d)\n", rc); + hl_device_reset(hdev, HL_DRV_RESET_HARD); + } } /* Now we can mark the compute_ctx as not active. Even if a reset is running in a different Reviewed-by: Ofir Bitton <obitton@habana.ai<mailto:obitton@habana.ai>>
diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c index 5e61761b8c11..d7d9198b2103 100644 --- a/drivers/accel/habanalabs/common/device.c +++ b/drivers/accel/habanalabs/common/device.c @@ -454,8 +454,10 @@ static void hpriv_release(struct kref *ref) /* Scrubbing is handled within hl_device_reset(), so here need to do it directly */ int rc = hdev->asic_funcs->scrub_device_mem(hdev); - if (rc) + if (rc) { dev_err(hdev->dev, "failed to scrub memory from hpriv release (%d)\n", rc); + hl_device_reset(hdev, HL_DRV_RESET_HARD); + } } /* Now we can mark the compute_ctx as not active. Even if a reset is running in a different
If scrubbing memory after user released device has failed it means the device is in a bad state and should be reset. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> --- drivers/accel/habanalabs/common/device.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)