diff mbox series

[1/3] dmaengine: idxd: Let probe fail when workqueue cannot be enabled

Message ID 1e74e8d74255ff47271c4c9eada7635676ccd320.1670005163.git.reinette.chatre@intel.com (mailing list archive)
State Superseded
Headers show
Series dmaengine: idxd: Error path fixes | expand

Commit Message

Reinette Chatre Dec. 2, 2022, 6:25 p.m. UTC
The workqueue is enabled when the appropriate driver is loaded and
disabled when the driver is removed. When the driver is removed it
assumes that the workqueue was enabled successfully and proceeds to
free allocations made during workqueue enabling.

Failure during workqueue enabling does not prevent the driver from
being loaded. This is because the error path within drv_enable_wq()
returns success unless a second failure is encountered
during the error path. By returning success it is possible to load
the driver even if the workqueue cannot be enabled and
allocations that do not exist are attempted to be freed during
driver remove.

Some examples of problematic flows:
(a)

 idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq():
 In above flow, if idxd_wq_request_irq() fails then
 idxd_wq_unmap_portal() is called on error exit path, but
 drv_enable_wq() returns 0 because idxd_wq_disable() succeeds. The
 driver is thus loaded successfully.

 idxd_dmaengine_drv_remove()->drv_disable_wq()->idxd_wq_unmap_portal()
 Above flow on driver unload triggers the WARN in devm_iounmap() because
 the device resource has already been removed during error path of
 drv_enable_wq().

(b)

 idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq():
 In above flow, if idxd_wq_request_irq() fails then
 idxd_wq_init_percpu_ref() is never called to initialize the percpu
 counter, yet the driver loads successfully because drv_enable_wq()
 returns 0.

 idxd_dmaengine_drv_remove()->__idxd_wq_quiesce()->percpu_ref_kill():
 Above flow on driver unload triggers a BUG when attempting to drop the
 initial ref of the uninitialized percpu ref:
 BUG: kernel NULL pointer dereference, address: 0000000000000010

Fix the drv_enable_wq() error path by returning the original error that
indicates failure of workqueue enabling. This ensures that the probe
fails when an error is encountered and the driver remove paths are only
attempted when the workqueue was enabled successfully.

Fixes: 1f2bb40337f0 ("dmaengine: idxd: move wq_enable() to device.c")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 drivers/dma/idxd/device.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Dave Jiang Dec. 2, 2022, 6:44 p.m. UTC | #1
On 12/2/2022 11:25 AM, Reinette Chatre wrote:
> The workqueue is enabled when the appropriate driver is loaded and
> disabled when the driver is removed. When the driver is removed it
> assumes that the workqueue was enabled successfully and proceeds to
> free allocations made during workqueue enabling.
> 
> Failure during workqueue enabling does not prevent the driver from
> being loaded. This is because the error path within drv_enable_wq()
> returns success unless a second failure is encountered
> during the error path. By returning success it is possible to load
> the driver even if the workqueue cannot be enabled and
> allocations that do not exist are attempted to be freed during
> driver remove.
> 
> Some examples of problematic flows:
> (a)
> 
>   idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq():
>   In above flow, if idxd_wq_request_irq() fails then
>   idxd_wq_unmap_portal() is called on error exit path, but
>   drv_enable_wq() returns 0 because idxd_wq_disable() succeeds. The
>   driver is thus loaded successfully.
> 
>   idxd_dmaengine_drv_remove()->drv_disable_wq()->idxd_wq_unmap_portal()
>   Above flow on driver unload triggers the WARN in devm_iounmap() because
>   the device resource has already been removed during error path of
>   drv_enable_wq().
> 
> (b)
> 
>   idxd_dmaengine_drv_probe() -> drv_enable_wq() -> idxd_wq_request_irq():
>   In above flow, if idxd_wq_request_irq() fails then
>   idxd_wq_init_percpu_ref() is never called to initialize the percpu
>   counter, yet the driver loads successfully because drv_enable_wq()
>   returns 0.
> 
>   idxd_dmaengine_drv_remove()->__idxd_wq_quiesce()->percpu_ref_kill():
>   Above flow on driver unload triggers a BUG when attempting to drop the
>   initial ref of the uninitialized percpu ref:
>   BUG: kernel NULL pointer dereference, address: 0000000000000010
> 
> Fix the drv_enable_wq() error path by returning the original error that
> indicates failure of workqueue enabling. This ensures that the probe
> fails when an error is encountered and the driver remove paths are only
> attempted when the workqueue was enabled successfully.
> 
> Fixes: 1f2bb40337f0 ("dmaengine: idxd: move wq_enable() to device.c")
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> ---
>   drivers/dma/idxd/device.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
> index 6f44fa8f78a5..fcd03d29a941 100644
> --- a/drivers/dma/idxd/device.c
> +++ b/drivers/dma/idxd/device.c
> @@ -1391,8 +1391,7 @@ int drv_enable_wq(struct idxd_wq *wq)
>   err_irq:
>   	idxd_wq_unmap_portal(wq);
>   err_map_portal:
> -	rc = idxd_wq_disable(wq, false);
> -	if (rc < 0)
> +	if (idxd_wq_disable(wq, false))
>   		dev_dbg(dev, "wq %s disable failed\n", dev_name(wq_confdev(wq)));
>   err:
>   	return rc;
Fenghua Yu Dec. 2, 2022, 6:52 p.m. UTC | #2
> The workqueue is enabled when the appropriate driver is loaded and disabled
> when the driver is removed. When the driver is removed it assumes that the
> workqueue was enabled successfully and proceeds to free allocations made
> during workqueue enabling.
...
> Fixes: 1f2bb40337f0 ("dmaengine: idxd: move wq_enable() to device.c")
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>

Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>

Thanks.

-Fenghua
diff mbox series

Patch

diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
index 6f44fa8f78a5..fcd03d29a941 100644
--- a/drivers/dma/idxd/device.c
+++ b/drivers/dma/idxd/device.c
@@ -1391,8 +1391,7 @@  int drv_enable_wq(struct idxd_wq *wq)
 err_irq:
 	idxd_wq_unmap_portal(wq);
 err_map_portal:
-	rc = idxd_wq_disable(wq, false);
-	if (rc < 0)
+	if (idxd_wq_disable(wq, false))
 		dev_dbg(dev, "wq %s disable failed\n", dev_name(wq_confdev(wq)));
 err:
 	return rc;