diff mbox series

[net-next] net: ethernet: mtk_wed: fix possible deadlock if mtk_wed_wo_init fails

Message ID a87f05e60ea1a94b571c9c87b69cc5b0e94943f2.1669999089.git.lorenzo@kernel.org (mailing list archive)
State New, archived
Series [net-next] net: ethernet: mtk_wed: fix possible deadlock if mtk_wed_wo_init fails

Commit Message

Lorenzo Bianconi Dec. 2, 2022, 5:36 p.m. UTC
Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
mtk_wed_attach routine if mtk_wed_wo_init fails.

Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
 drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
 drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
 3 files changed, 26 insertions(+), 11 deletions(-)
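
For readers unfamiliar with the pattern named in the commit message, the following is a minimal user-space sketch of why calling a locking function from a path that already holds the lock is a problem, and how a helper that does not take the lock avoids it. It is not the driver code: the toy lock, demo_hw_lock, struct demo_dev and all demo_* functions are hypothetical names invented for illustration. The toy lock returns -EDEADLK where a real non-recursive mutex would simply block forever.

```c
#include <assert.h>
#include <errno.h>

/* Toy non-recursive lock: stands in for a kernel mutex in this sketch. */
struct demo_mutex {
	int held;
};

static int demo_mutex_lock(struct demo_mutex *m)
{
	if (m->held)
		return -EDEADLK; /* a real mutex would block forever here */
	m->held = 1;
	return 0;
}

static void demo_mutex_unlock(struct demo_mutex *m)
{
	assert(m->held);
	m->held = 0;
}

static struct demo_mutex demo_hw_lock;

struct demo_dev {
	int attached;
};

/* Caller must already hold demo_hw_lock. */
static void demo_detach_locked(struct demo_dev *dev)
{
	assert(demo_hw_lock.held);
	dev->attached = 0;
}

/* Public entry point: takes the lock itself. */
static int demo_detach(struct demo_dev *dev)
{
	int err = demo_mutex_lock(&demo_hw_lock);

	if (err)
		return err;
	demo_detach_locked(dev);
	demo_mutex_unlock(&demo_hw_lock);
	return 0;
}

/* Pre-patch shape: the error path re-enters the locking variant. */
static int demo_attach_buggy(struct demo_dev *dev, int init_ret)
{
	int err = 0;

	demo_mutex_lock(&demo_hw_lock);
	dev->attached = 1;
	if (init_ret)
		err = demo_detach(dev); /* lock already held: -EDEADLK */
	demo_mutex_unlock(&demo_hw_lock);
	return err;
}

/* Post-patch shape: the error path calls the helper that takes no lock. */
static int demo_attach_fixed(struct demo_dev *dev, int init_ret)
{
	demo_mutex_lock(&demo_hw_lock);
	dev->attached = 1;
	if (init_ret)
		demo_detach_locked(dev);
	demo_mutex_unlock(&demo_hw_lock);
	return init_ret;
}
```

In this sketch demo_detach_locked() plays the role that the double-underscore helper plays in the patch: both the public detach path and the attach error path can share it, and only the outermost caller touches the lock.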

Comments

Leon Romanovsky Dec. 4, 2022, 1:06 p.m. UTC | #1
On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> mtk_wed_attach routine if mtk_wed_wo_init fails.
> 
> Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
>  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
>  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
>  3 files changed, 26 insertions(+), 11 deletions(-)

<...>

> diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> index f9539e6233c9..b084009a32f9 100644
> --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
>  	u16 seq;
>  	int ret;
>  
> +	if (!wo)
> +		return -ENODEV;

<...>

>  static void
>  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
>  {
> +	if (!wo)
> +		return;

How are these changes related to the deadlock described in the commit message?
How is it possible to reach internal mtk functions without a valid wo?

Thanks

> +
>  	/* disable interrupts */
>  	mtk_wed_wo_set_isr(wo, 0);
>  
> -- 
> 2.38.1
>
Lorenzo Bianconi Dec. 4, 2022, 3:09 p.m. UTC | #2
> On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> > Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> > mtk_wed_attach routine if mtk_wed_wo_init fails.
> > 
> > Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > ---
> >  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
> >  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
> >  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
> >  3 files changed, 26 insertions(+), 11 deletions(-)
> 
> <...>
> 
> > diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > index f9539e6233c9..b084009a32f9 100644
> > --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
> >  	u16 seq;
> >  	int ret;
> >  
> > +	if (!wo)
> > +		return -ENODEV;
> 
> <...>
> 
> >  static void
> >  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
> >  {
> > +	if (!wo)
> > +		return;
> 
> How are these changes related to the written in deadlock?
> How is it possible to get internal mtk functions without valid wo?

Hi Leon,

If mtk_wed_rro_alloc() fails in mtk_wed_attach(), we will end up running
__mtk_wed_detach() before the wo struct has been allocated (wo is allocated in
mtk_wed_wo_init()).
Moreover, __mtk_wed_detach() can run mtk_wed_wo_reset() and mtk_wed_wo_deinit(),
so we need to check that the wo pointer is properly set. We will face the same
issue if the wo allocation fails in the mtk_wed_wo_init routine.
Once the deadlock is removed, we need to take these conditions into account as well.

Regards,
Lorenzo

> 
> Thanks
> 
> > +
> >  	/* disable interrupts */
> >  	mtk_wed_wo_set_isr(wo, 0);
> >  
> > -- 
> > 2.38.1
> >
Leon Romanovsky Dec. 5, 2022, 7:29 a.m. UTC | #3
On Sun, Dec 04, 2022 at 04:09:21PM +0100, Lorenzo Bianconi wrote:
> > On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> > > Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> > > mtk_wed_attach routine if mtk_wed_wo_init fails.
> > > 
> > > Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > ---
> > >  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
> > >  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
> > >  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
> > >  3 files changed, 26 insertions(+), 11 deletions(-)
> > 
> > <...>
> > 
> > > diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > index f9539e6233c9..b084009a32f9 100644
> > > --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
> > >  	u16 seq;
> > >  	int ret;
> > >  
> > > +	if (!wo)
> > > +		return -ENODEV;
> > 
> > <...>
> > 
> > >  static void
> > >  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
> > >  {
> > > +	if (!wo)
> > > +		return;
> > 
> > How are these changes related to the written in deadlock?
> > How is it possible to get internal mtk functions without valid wo?
> 
> Hi Leon,
> 
> if mtk_wed_rro_alloc() fails in mtk_wed_attach(), we will end up running
> __mtk_wed_detach() when wo struct is not allocated yet (wo is allocated in
> mtk_wed_wo_init()).

IMHO, this is the culprit: a proper error unwind means that you won't call
uninit functions for something that is not initialized yet. It is better
to fix that instead of adding "if (!wo) ..." checks.

> Moreover __mtk_wed_detach() can run mtk_wed_wo_reset() and mtk_wed_wo_deinit()

This is the other side of the same coin. If you can run them in parallel, you
need locking protection and the ability to cancel work, so that nothing is
executed once cleanup has succeeded.

These were my 2 cents, totally IMHO.

Thanks
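
The unwind style described above can be sketched in plain C as a goto ladder, where each failure jumps to a label that tears down only what was already set up. This is an illustrative sketch, not the mtk_wed code: struct demo_hw, the fail_at parameter and all demo_* functions are hypothetical, and the teardown counters exist only so the behaviour can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_hw {
	void *rro;          /* first resource */
	void *wo;           /* second resource, set up after rro */
	int rro_teardowns;  /* how many times rro was torn down */
	int wo_teardowns;   /* how many times wo was torn down */
};

/* Both teardown helpers assume their resource was initialized. */
static void demo_rro_free(struct demo_hw *hw)
{
	free(hw->rro);
	hw->rro = NULL;
	hw->rro_teardowns++;
}

static void demo_wo_deinit(struct demo_hw *hw)
{
	free(hw->wo);
	hw->wo = NULL;
	hw->wo_teardowns++;
}

/*
 * fail_at selects an injected failure:
 * 0 = none, 1 = rro setup, 2 = wo setup, 3 = a later step after both.
 */
static int demo_attach(struct demo_hw *hw, int fail_at)
{
	hw->rro = fail_at == 1 ? NULL : malloc(16);
	if (!hw->rro)
		return -ENOMEM;         /* nothing to undo yet */

	hw->wo = fail_at == 2 ? NULL : malloc(16);
	if (!hw->wo)
		goto err_free_rro;      /* wo never existed: skip its deinit */

	if (fail_at == 3)
		goto err_deinit_wo;     /* both exist: undo in reverse order */

	return 0;

err_deinit_wo:
	demo_wo_deinit(hw);
err_free_rro:
	demo_rro_free(hw);
	return -ENOMEM;
}
```

With this shape the teardown helpers never see an uninitialized resource, so they need no defensive NULL checks of their own.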
Lorenzo Bianconi Dec. 5, 2022, 9:04 a.m. UTC | #4
On Dec 05, Leon Romanovsky wrote:
> On Sun, Dec 04, 2022 at 04:09:21PM +0100, Lorenzo Bianconi wrote:
> > > On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> > > > Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> > > > mtk_wed_attach routine if mtk_wed_wo_init fails.
> > > > 
> > > > Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> > > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > > ---
> > > >  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
> > > >  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
> > > >  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
> > > >  3 files changed, 26 insertions(+), 11 deletions(-)
> > > 
> > > <...>
> > > 
> > > > diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > index f9539e6233c9..b084009a32f9 100644
> > > > --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
> > > >  	u16 seq;
> > > >  	int ret;
> > > >  
> > > > +	if (!wo)
> > > > +		return -ENODEV;
> > > 
> > > <...>
> > > 
> > > >  static void
> > > >  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
> > > >  {
> > > > +	if (!wo)
> > > > +		return;
> > > 
> > > How are these changes related to the written in deadlock?
> > > How is it possible to get internal mtk functions without valid wo?
> > 
> > Hi Leon,
> > 
> > if mtk_wed_rro_alloc() fails in mtk_wed_attach(), we will end up running
> > __mtk_wed_detach() when wo struct is not allocated yet (wo is allocated in
> > mtk_wed_wo_init()).
> 
> IMHO, it is a culprit, proper error unwind means that you won't call to
> uninit functions for something that is not initialized yet. It is better
> to fix it instead of adding "if (!wo) ..." checks.

So, iiuc, you would prefer to do something like:

__mtk_wed_detach()
{
	...
	if (mtk_wed_get_rx_capa(dev) && wo) {
		mtk_wed_wo_reset(dev);
		mtk_wed_free_rx_rings(dev);
		mtk_wed_wo_deinit(hw);
	}
	...
	
Right? I am fine both ways :)

> 
> > Moreover __mtk_wed_detach() can run mtk_wed_wo_reset() and mtk_wed_wo_deinit()
> 
> This is another side of same coin. If you can run them in parallel, you
> need locking protection and ability to cancel work, so nothing is going
> to be executed once cleanup succeeded.

Sorry, I did not get what you mean here by 'in parallel'. __mtk_wed_detach()
always runs with the hw_lock mutex held, in both mtk_wed_attach() and
mtk_wed_detach().

Regards,
Lorenzo

> 
> These were my 2 cents, totally IMHO.
> 
> Thanks
Leon Romanovsky Dec. 5, 2022, 9:32 a.m. UTC | #5
On Mon, Dec 05, 2022 at 10:04:07AM +0100, Lorenzo Bianconi wrote:
> On Dec 05, Leon Romanovsky wrote:
> > On Sun, Dec 04, 2022 at 04:09:21PM +0100, Lorenzo Bianconi wrote:
> > > > On Fri, Dec 02, 2022 at 06:36:33PM +0100, Lorenzo Bianconi wrote:
> > > > > Introduce __mtk_wed_detach() in order to avoid a possible deadlock in
> > > > > mtk_wed_attach routine if mtk_wed_wo_init fails.
> > > > > 
> > > > > Fixes: 4c5de09eb0d0 ("net: ethernet: mtk_wed: add configure wed wo support")
> > > > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > > > ---
> > > > >  drivers/net/ethernet/mediatek/mtk_wed.c     | 24 ++++++++++++++-------
> > > > >  drivers/net/ethernet/mediatek/mtk_wed_mcu.c | 10 ++++++---
> > > > >  drivers/net/ethernet/mediatek/mtk_wed_wo.c  |  3 +++
> > > > >  3 files changed, 26 insertions(+), 11 deletions(-)
> > > > 
> > > > <...>
> > > > 
> > > > > diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > > index f9539e6233c9..b084009a32f9 100644
> > > > > --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > > +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
> > > > > @@ -176,6 +176,9 @@ int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
> > > > >  	u16 seq;
> > > > >  	int ret;
> > > > >  
> > > > > +	if (!wo)
> > > > > +		return -ENODEV;
> > > > 
> > > > <...>
> > > > 
> > > > >  static void
> > > > >  mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
> > > > >  {
> > > > > +	if (!wo)
> > > > > +		return;
> > > > 
> > > > How are these changes related to the written in deadlock?
> > > > How is it possible to get internal mtk functions without valid wo?
> > > 
> > > Hi Leon,
> > > 
> > > if mtk_wed_rro_alloc() fails in mtk_wed_attach(), we will end up running
> > > __mtk_wed_detach() when wo struct is not allocated yet (wo is allocated in
> > > mtk_wed_wo_init()).
> > 
> > IMHO, it is a culprit, proper error unwind means that you won't call to
> > uninit functions for something that is not initialized yet. It is better
> > to fix it instead of adding "if (!wo) ..." checks.
> 
> So, iiuc, you would prefer to do something like:
> 
> __mtk_wed_detach()
> {
> 	...
> 	if (mtk_wed_get_rx_capa(dev) && wo) {
> 		mtk_wed_wo_reset(dev);
> 		mtk_wed_free_rx_rings(dev);
> 		mtk_wed_wo_deinit(hw);
> 	}
> 	...
> 	
> Right? I am fine both ways :)

Yes

> 
> > 
> > > Moreover __mtk_wed_detach() can run mtk_wed_wo_reset() and mtk_wed_wo_deinit()
> > 
> > This is another side of same coin. If you can run them in parallel, you
> > need locking protection and ability to cancel work, so nothing is going
> > to be executed once cleanup succeeded.
> 
> Sorry, I did not get what you mean here with 'in parallel'. __mtk_wed_detach()
> always run with hw_lock mutex help in both mtk_wed_attach() or
> mtk_wed_detach().

The lock is not enough; you need to make sure that no underlying code is
called without wo. Your suggestion above is fine. The fewer "if (!wo) ..."
checks the low-level code has, the better.

Thanks
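
The caller-side guard agreed on above might look roughly like the following stand-alone sketch, in which the low-level helpers assume a valid pointer and the single check lives in the detach path. All names here (struct demo_wo, struct demo_hw, rx_capable and the demo_* functions) are hypothetical stand-ins and do not reproduce the driver's real structures or signatures.

```c
#include <assert.h>
#include <stddef.h>

struct demo_wo {
	int isr_mask;  /* nonzero while interrupts are enabled */
	int resets;    /* number of resets performed */
};

struct demo_hw {
	struct demo_wo *wo;  /* NULL until the wo init step has succeeded */
	int rx_capable;      /* stand-in for an rx capability query */
};

/* Low-level helpers: no NULL checks, the caller guarantees wo is valid. */
static void demo_wo_reset(struct demo_wo *wo)
{
	wo->resets++;
}

static void demo_wo_deinit(struct demo_wo *wo)
{
	wo->isr_mask = 0; /* disable interrupts */
}

/*
 * Detach path: the only place that decides whether wo teardown runs.
 * Returns 1 if the wo teardown was executed, 0 if it was skipped.
 */
static int demo_detach_locked(struct demo_hw *hw)
{
	if (hw->rx_capable && hw->wo) {
		demo_wo_reset(hw->wo);
		demo_wo_deinit(hw->wo);
		hw->wo = NULL;
		return 1;
	}
	return 0;
}
```

Keeping the check in one place means a failed or skipped wo init is handled once, at the level that knows the attach state, rather than in every helper.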

> 
> Regards,
> Lorenzo
> 
> > 
> > These were my 2 cents, totally IMHO.
> > 
> > Thanks
Jakub Kicinski Dec. 6, 2022, 1:44 a.m. UTC | #6
On Mon, 5 Dec 2022 10:04:07 +0100 Lorenzo Bianconi wrote:
> > IMHO, it is a culprit, proper error unwind means that you won't call to
> > uninit functions for something that is not initialized yet. It is better
> > to fix it instead of adding "if (!wo) ..." checks.  
> 
> So, iiuc, you would prefer to do something like:
> 
> __mtk_wed_detach()
> {
> 	...
> 	if (mtk_wed_get_rx_capa(dev) && wo) {
> 		mtk_wed_wo_reset(dev);
> 		mtk_wed_free_rx_rings(dev);
> 		mtk_wed_wo_deinit(hw);
> 	}
> 	...
> 	
> Right? I am fine both ways :)

FWIW, that does seem slightly better to me as well.
Also - aren't you really fixing multiple issues here 
(even if on the same error path)? The locking, 
the null-checking and the change in mtk_wed_wo_reset()?
Lorenzo Bianconi Dec. 6, 2022, 11:52 p.m. UTC | #7
> > > IMHO, it is a culprit, proper error unwind means that you won't call to
> > > uninit functions for something that is not initialized yet. It is better
> > > to fix it instead of adding "if (!wo) ..." checks.  
> > 
> > So, iiuc, you would prefer to do something like:
> > 
> > __mtk_wed_detach()
> > {
> > 	...
> > 	if (mtk_wed_get_rx_capa(dev) && wo) {
> > 		mtk_wed_wo_reset(dev);
> > 		mtk_wed_free_rx_rings(dev);
> > 		mtk_wed_wo_deinit(hw);
> > 	}
> > 	...
> > 	
> > Right? I am fine both ways :)
> 
> FWIW, that does seem slightly better to me as well.
> Also - aren't you really fixing multiple issues here 
> (even if on the same error path)? The locking, 
> the null-checking and the change in mtk_wed_wo_reset()?

The wo NULL pointer issue was not hit before because of the deadlock (so I fixed
them in the same patch).
Do you prefer to split them into two patches (wo NULL pointer fix first)?

I have posted v2 addressing Leon's comments but I need to post a v3 to add
missing WARN_ON.

Regards,
Lorenzo
Jakub Kicinski Dec. 7, 2022, 1:24 a.m. UTC | #8
On Wed, 7 Dec 2022 00:52:28 +0100 Lorenzo Bianconi wrote:
> > FWIW, that does seem slightly better to me as well.
> > Also - aren't you really fixing multiple issues here 
> > (even if on the same error path)? The locking, 
> > the null-checking and the change in mtk_wed_wo_reset()?  
> 
> wo NULL pointer issue was not hit before for the deadlock one (so I fixed them
> in the same patch).
> Do you prefer to split them in two patches? (wo null pointer fix first).

Yes, I think they are different issues even if one "covers" the other.
I think it'd make the review / judgment easier.

> I have posted v2 addressing Leon's comments but I need to post a v3 to add
> missing WARN_ON.
Lorenzo Bianconi Dec. 7, 2022, 8:58 a.m. UTC | #9
> On Wed, 7 Dec 2022 00:52:28 +0100 Lorenzo Bianconi wrote:
> > > FWIW, that does seem slightly better to me as well.
> > > Also - aren't you really fixing multiple issues here 
> > > (even if on the same error path)? The locking, 
> > > the null-checking and the change in mtk_wed_wo_reset()?  
> > 
> > wo NULL pointer issue was not hit before for the deadlock one (so I fixed them
> > in the same patch).
> > Do you prefer to split them in two patches? (wo null pointer fix first).
> 
> Yes, I think they are different issues even if once "covers" the other.
> I think it'd make the review / judgment easier.

ok, I will post v3 splitting them.

Regards,
Lorenzo

> 
> > I have posted v2 addressing Leon's comments but I need to post a v3 to add
> > missing WARN_ON.
>

Patch

diff --git a/drivers/net/ethernet/mediatek/mtk_wed.c b/drivers/net/ethernet/mediatek/mtk_wed.c
index d041615b2bac..6352abd4157e 100644
--- a/drivers/net/ethernet/mediatek/mtk_wed.c
+++ b/drivers/net/ethernet/mediatek/mtk_wed.c
@@ -174,9 +174,10 @@  mtk_wed_wo_reset(struct mtk_wed_device *dev)
 	mtk_wdma_tx_reset(dev);
 	mtk_wed_reset(dev, MTK_WED_RESET_WED);
 
-	mtk_wed_mcu_send_msg(wo, MTK_WED_MODULE_ID_WO,
-			     MTK_WED_WO_CMD_CHANGE_STATE, &state,
-			     sizeof(state), false);
+	if (mtk_wed_mcu_send_msg(wo, MTK_WED_MODULE_ID_WO,
+				 MTK_WED_WO_CMD_CHANGE_STATE, &state,
+				 sizeof(state), false))
+		return;
 
 	if (readx_poll_timeout(mtk_wed_wo_read_status, dev, val,
 			       val == MTK_WED_WOIF_DISABLE_DONE,
@@ -576,12 +577,10 @@  mtk_wed_deinit(struct mtk_wed_device *dev)
 }
 
 static void
-mtk_wed_detach(struct mtk_wed_device *dev)
+__mtk_wed_detach(struct mtk_wed_device *dev)
 {
 	struct mtk_wed_hw *hw = dev->hw;
 
-	mutex_lock(&hw_lock);
-
 	mtk_wed_deinit(dev);
 
 	mtk_wdma_rx_reset(dev);
@@ -612,6 +611,13 @@  mtk_wed_detach(struct mtk_wed_device *dev)
 	module_put(THIS_MODULE);
 
 	hw->wed_dev = NULL;
+}
+
+static void
+mtk_wed_detach(struct mtk_wed_device *dev)
+{
+	mutex_lock(&hw_lock);
+	__mtk_wed_detach(dev);
 	mutex_unlock(&hw_lock);
 }
 
@@ -1490,8 +1496,10 @@  mtk_wed_attach(struct mtk_wed_device *dev)
 		ret = mtk_wed_wo_init(hw);
 	}
 out:
-	if (ret)
-		mtk_wed_detach(dev);
+	if (ret) {
+		dev_err(dev->hw->dev, "failed to attach wed device\n");
+		__mtk_wed_detach(dev);
+	}
 unlock:
 	mutex_unlock(&hw_lock);
 
diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
index f9539e6233c9..b084009a32f9 100644
--- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
+++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c
@@ -176,6 +176,9 @@  int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
 	u16 seq;
 	int ret;
 
+	if (!wo)
+		return -ENODEV;
+
 	skb = mtk_wed_mcu_msg_alloc(data, len);
 	if (!skb)
 		return -ENOMEM;
@@ -202,13 +205,14 @@  int mtk_wed_mcu_send_msg(struct mtk_wed_wo *wo, int id, int cmd,
 int mtk_wed_mcu_msg_update(struct mtk_wed_device *dev, int id, void *data,
 			   int len)
 {
-	struct mtk_wed_wo *wo = dev->hw->wed_wo;
+	if (!dev->hw || !dev->hw->wed_wo)
+		return 0;
 
 	if (dev->hw->version == 1)
 		return 0;
 
-	return mtk_wed_mcu_send_msg(wo, MTK_WED_MODULE_ID_WO, id, data, len,
-				    true);
+	return mtk_wed_mcu_send_msg(dev->hw->wed_wo, MTK_WED_MODULE_ID_WO, id,
+				    data, len, true);
 }
 
 static int
diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.c b/drivers/net/ethernet/mediatek/mtk_wed_wo.c
index a219da85f4db..92440d62e01c 100644
--- a/drivers/net/ethernet/mediatek/mtk_wed_wo.c
+++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.c
@@ -464,6 +464,9 @@  mtk_wed_wo_hardware_init(struct mtk_wed_wo *wo)
 static void
 mtk_wed_wo_hw_deinit(struct mtk_wed_wo *wo)
 {
+	if (!wo)
+		return;
+
 	/* disable interrupts */
 	mtk_wed_wo_set_isr(wo, 0);