diff mbox series

coresight: Fix possible deadlock with lock dependency

Message ID 20220721130329.3787211-1-sudeep.holla@arm.com (mailing list archive)
State New, archived
Headers show
Series coresight: Fix possible deadlock with lock dependency | expand

Commit Message

Sudeep Holla July 21, 2022, 1:03 p.m. UTC
With lockdeps enabled, we get the following warning:

======================================================
WARNING: possible circular locking dependency detected
------------------------------------------------------
kworker/u12:1/53 is trying to acquire lock:
ffff80000adce220 (coresight_mutex){+.+.}-{4:4}, at: coresight_set_assoc_ectdev_mutex+0x3c/0x5c
but task is already holding lock:
ffff80000add1f60 (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394

which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:

-> #1 (ect_mutex){+.+.}-{4:4}:
       __mutex_lock_common+0xd8/0xe60
       mutex_lock_nested+0x44/0x50
       cti_add_assoc_to_csdev+0x4c/0x184
       coresight_register+0x2f0/0x314
       tmc_probe+0x33c/0x414

-> #0 (coresight_mutex){+.+.}-{4:4}:
       __lock_acquire+0x1a20/0x32d0
       lock_acquire+0x160/0x308
       __mutex_lock_common+0xd8/0xe60
       mutex_lock_nested+0x44/0x50
       coresight_set_assoc_ectdev_mutex+0x3c/0x5c
       cti_update_conn_xrefs+0x6c/0xf8
       cti_probe+0x33c/0x394

other info that might help us debug this:
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(ect_mutex);
                               lock(coresight_mutex);
                               lock(ect_mutex);
  lock(coresight_mutex);
 *** DEADLOCK ***

4 locks held by kworker/u12:1/53:
 #0: ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1fc/0x63c
 #1: (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x228/0x63c
 #2: (&dev->mutex){....}-{4:4}, at: __device_attach+0x48/0x1a8
 #3: (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394

To fix the same, call cti_add_assoc_to_csdev without the holding
coresight_mutex and confine the locking while setting the associated
ect / cti device using coresight_set_assoc_ectdev_mutex().

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/hwtracing/coresight/coresight-core.c     | 7 ++++---
 drivers/hwtracing/coresight/coresight-cti-core.c | 5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

Hi,

I am not sure if I missed to consider something, but this is something
I think could be the fix.

Regards,
Sudeep

Comments

Mike Leach Aug. 15, 2022, 2:48 p.m. UTC | #1
Hi Sudeep,

On Thu, 21 Jul 2022 at 14:03, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> With lockdeps enabled, we get the following warning:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> ------------------------------------------------------
> kworker/u12:1/53 is trying to acquire lock:
> ffff80000adce220 (coresight_mutex){+.+.}-{4:4}, at: coresight_set_assoc_ectdev_mutex+0x3c/0x5c
> but task is already holding lock:
> ffff80000add1f60 (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
>
> which lock already depends on the new lock.
> the existing dependency chain (in reverse order) is:
>
> -> #1 (ect_mutex){+.+.}-{4:4}:
>        __mutex_lock_common+0xd8/0xe60
>        mutex_lock_nested+0x44/0x50
>        cti_add_assoc_to_csdev+0x4c/0x184
>        coresight_register+0x2f0/0x314
>        tmc_probe+0x33c/0x414
>
> -> #0 (coresight_mutex){+.+.}-{4:4}:
>        __lock_acquire+0x1a20/0x32d0
>        lock_acquire+0x160/0x308
>        __mutex_lock_common+0xd8/0xe60
>        mutex_lock_nested+0x44/0x50
>        coresight_set_assoc_ectdev_mutex+0x3c/0x5c
>        cti_update_conn_xrefs+0x6c/0xf8
>        cti_probe+0x33c/0x394
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>        CPU0                    CPU1
>        ----                    ----
>   lock(ect_mutex);
>                                lock(coresight_mutex);
>                                lock(ect_mutex);
>   lock(coresight_mutex);
>  *** DEADLOCK ***
>
> 4 locks held by kworker/u12:1/53:
>  #0: ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1fc/0x63c
>  #1: (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x228/0x63c
>  #2: (&dev->mutex){....}-{4:4}, at: __device_attach+0x48/0x1a8
>  #3: (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
>
> To fix the same, call cti_add_assoc_to_csdev without the holding
> coresight_mutex and confine the locking while setting the associated
> ect / cti device using coresight_set_assoc_ectdev_mutex().
>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Mike Leach <mike.leach@linaro.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-core.c     | 7 ++++---
>  drivers/hwtracing/coresight/coresight-cti-core.c | 5 +++--
>  2 files changed, 7 insertions(+), 5 deletions(-)
>
> Hi,
>
> I am not sure if I missed to consider something, but this is something
> I think could be the fix.
>
> Regards,
> Sudeep
>

I have been running with lockdep testing coresight for a number of
weeks now, without encountering the same issue with CTIs.
That said, I am on a DB410, and the code in question is designed to
take different paths depending on whether the CTI or other device is
discovered first, so this could well be something different in the
discovery order between juno and DB410.

The change looks perfectly fine to me.

As such:
Reviewed-by: Mike Leach <mike.leach@linaro.org>

> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 1edfec1e9d18..275b4dce8401 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -1659,14 +1659,15 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
>                 ret = coresight_fixup_device_conns(csdev);
>         if (!ret)
>                 ret = coresight_fixup_orphan_conns(csdev);
> -       if (!ret && cti_assoc_ops && cti_assoc_ops->add)
> -               cti_assoc_ops->add(csdev);
>
>  out_unlock:
>         mutex_unlock(&coresight_mutex);
>         /* Success */
> -       if (!ret)
> +       if (!ret) {
> +               if (cti_assoc_ops && cti_assoc_ops->add)
> +                       cti_assoc_ops->add(csdev);
>                 return csdev;
> +       }
>
>         /* Unregister the device if needed */
>         if (registered) {
> diff --git a/drivers/hwtracing/coresight/coresight-cti-core.c b/drivers/hwtracing/coresight/coresight-cti-core.c
> index 8988b2ed2ea6..1be92342b5b9 100644
> --- a/drivers/hwtracing/coresight/coresight-cti-core.c
> +++ b/drivers/hwtracing/coresight/coresight-cti-core.c
> @@ -541,7 +541,7 @@ cti_match_fixup_csdev(struct cti_device *ctidev, const char *node_name,
>  /*
>   * Search the cti list to add an associated CTI into the supplied CS device
>   * This will set the association if CTI declared before the CS device.
> - * (called from coresight_register() with coresight_mutex locked).
> + * (called from coresight_register() without coresight_mutex locked).
>   */
>  static void cti_add_assoc_to_csdev(struct coresight_device *csdev)
>  {
> @@ -569,7 +569,8 @@ static void cti_add_assoc_to_csdev(struct coresight_device *csdev)
>                          * if we found a matching csdev then update the ECT
>                          * association pointer for the device with this CTI.
>                          */
> -                       csdev->ect_dev = ect_item->csdev;
> +                       coresight_set_assoc_ectdev_mutex(csdev->ect_dev,
> +                                                        ect_item->csdev);
>                         break;
>                 }
>         }
> --
> 2.37.1
>
Suzuki K Poulose Oct. 5, 2022, 9:35 a.m. UTC | #2
On 21/07/2022 14:03, Sudeep Holla wrote:
> With lockdeps enabled, we get the following warning:
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> ------------------------------------------------------
> kworker/u12:1/53 is trying to acquire lock:
> ffff80000adce220 (coresight_mutex){+.+.}-{4:4}, at: coresight_set_assoc_ectdev_mutex+0x3c/0x5c
> but task is already holding lock:
> ffff80000add1f60 (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
> 
> which lock already depends on the new lock.
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (ect_mutex){+.+.}-{4:4}:
>         __mutex_lock_common+0xd8/0xe60
>         mutex_lock_nested+0x44/0x50
>         cti_add_assoc_to_csdev+0x4c/0x184
>         coresight_register+0x2f0/0x314
>         tmc_probe+0x33c/0x414
> 
> -> #0 (coresight_mutex){+.+.}-{4:4}:
>         __lock_acquire+0x1a20/0x32d0
>         lock_acquire+0x160/0x308
>         __mutex_lock_common+0xd8/0xe60
>         mutex_lock_nested+0x44/0x50
>         coresight_set_assoc_ectdev_mutex+0x3c/0x5c
>         cti_update_conn_xrefs+0x6c/0xf8
>         cti_probe+0x33c/0x394
> 
> other info that might help us debug this:
>   Possible unsafe locking scenario:
>         CPU0                    CPU1
>         ----                    ----
>    lock(ect_mutex);
>                                 lock(coresight_mutex);
>                                 lock(ect_mutex);
>    lock(coresight_mutex);
>   *** DEADLOCK ***
> 
> 4 locks held by kworker/u12:1/53:
>   #0: ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1fc/0x63c
>   #1: (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x228/0x63c
>   #2: (&dev->mutex){....}-{4:4}, at: __device_attach+0x48/0x1a8
>   #3: (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
> 
> To fix the same, call cti_add_assoc_to_csdev without the holding
> coresight_mutex and confine the locking while setting the associated
> ect / cti device using coresight_set_assoc_ectdev_mutex().
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Mike Leach <mike.leach@linaro.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

Thanks for the fix. I will push this out at rc1 as fixes.

Thanks
Suzuki
diff mbox series

Patch

diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index 1edfec1e9d18..275b4dce8401 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -1659,14 +1659,15 @@  struct coresight_device *coresight_register(struct coresight_desc *desc)
 		ret = coresight_fixup_device_conns(csdev);
 	if (!ret)
 		ret = coresight_fixup_orphan_conns(csdev);
-	if (!ret && cti_assoc_ops && cti_assoc_ops->add)
-		cti_assoc_ops->add(csdev);
 
 out_unlock:
 	mutex_unlock(&coresight_mutex);
 	/* Success */
-	if (!ret)
+	if (!ret) {
+		if (cti_assoc_ops && cti_assoc_ops->add)
+			cti_assoc_ops->add(csdev);
 		return csdev;
+	}
 
 	/* Unregister the device if needed */
 	if (registered) {
diff --git a/drivers/hwtracing/coresight/coresight-cti-core.c b/drivers/hwtracing/coresight/coresight-cti-core.c
index 8988b2ed2ea6..1be92342b5b9 100644
--- a/drivers/hwtracing/coresight/coresight-cti-core.c
+++ b/drivers/hwtracing/coresight/coresight-cti-core.c
@@ -541,7 +541,7 @@  cti_match_fixup_csdev(struct cti_device *ctidev, const char *node_name,
 /*
  * Search the cti list to add an associated CTI into the supplied CS device
  * This will set the association if CTI declared before the CS device.
- * (called from coresight_register() with coresight_mutex locked).
+ * (called from coresight_register() without coresight_mutex locked).
  */
 static void cti_add_assoc_to_csdev(struct coresight_device *csdev)
 {
@@ -569,7 +569,8 @@  static void cti_add_assoc_to_csdev(struct coresight_device *csdev)
 			 * if we found a matching csdev then update the ECT
 			 * association pointer for the device with this CTI.
 			 */
-			csdev->ect_dev = ect_item->csdev;
+			coresight_set_assoc_ectdev_mutex(csdev->ect_dev,
+							 ect_item->csdev);
 			break;
 		}
 	}