[v2] usb: ucsi: fix connector partner ucsi work issue

Message ID	1672897360-24257-1-git-send-email-quic_linyyuan@quicinc.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-usb-owner@vger.kernel.org> From: Linyu Yuan <quic_linyyuan@quicinc.com> To: Heikki Krogerus <heikki.krogerus@linux.intel.com>, Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: <linux-usb@vger.kernel.org>, Jack Pham <quic_jackp@quicinc.com>, "Wesley Cheng" <quic_wcheng@quicinc.com>, Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>, Fenglin Wu <quic_fenglinw@quicinc.com>, "Linyu Yuan" <quic_linyyuan@quicinc.com> Subject: [PATCH v2] usb: ucsi: fix connector partner ucsi work issue Date: Thu, 5 Jan 2023 13:42:40 +0800 Message-ID: <1672897360-24257-1-git-send-email-quic_linyyuan@quicinc.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain Precedence: bulk
Series	[v2] usb: ucsi: fix connector partner ucsi work issue \| expand [v2] usb: ucsi: fix connector partner ucsi work issue

Message ID

1672897360-24257-1-git-send-email-quic_linyyuan@quicinc.com (mailing list archive)

State

New, archived

Headers

From: Linyu Yuan <quic_linyyuan@quicinc.com>
To: Heikki Krogerus <heikki.krogerus@linux.intel.com>,
        Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: <linux-usb@vger.kernel.org>, Jack Pham <quic_jackp@quicinc.com>,
        "Wesley Cheng" <quic_wcheng@quicinc.com>,
        Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>,
        Fenglin Wu <quic_fenglinw@quicinc.com>,
        "Linyu Yuan" <quic_linyyuan@quicinc.com>
Subject: [PATCH v2] usb: ucsi: fix connector partner ucsi work issue
Date: Thu, 5 Jan 2023 13:42:40 +0800
Message-ID: <1672897360-24257-1-git-send-email-quic_linyyuan@quicinc.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain
Precedence: bulk

Series

[v2] usb: ucsi: fix connector partner ucsi work issue | expand

Commit Message

Linyu Yuan Jan. 5, 2023, 5:42 a.m. UTC

When a PPM client unregisters with UCSI framework, connector specific work
queue is destroyed. However, a pending delayed work queued before may
still exist. Once the delay timer expires and the work is scheduled,
this can cause a system crash as the workqueue is destroyed already.

Fix this by moving all partner related delayed work to connector instance
and cancel all of them when ucsi_unregister() is called by PPM client.

Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking")
Signed-off-by: Linyu Yuan <quic_linyyuan@quicinc.com>
---
v2: update commit description and remove cc stable list

 drivers/usb/typec/ucsi/ucsi.c | 61 +++++++++++++++++++++++++------------------
 drivers/usb/typec/ucsi/ucsi.h | 11 ++++++++
 2 files changed, 46 insertions(+), 26 deletions(-)

Comments

Heikki Krogerus Jan. 5, 2023, 11:27 a.m. UTC | #1

Hi,

On Thu, Jan 05, 2023 at 01:42:40PM +0800, Linyu Yuan wrote:
> When a PPM client unregisters with UCSI framework, connector specific work
> queue is destroyed. However, a pending delayed work queued before may
> still exist. Once the delay timer expires and the work is scheduled,
> this can cause a system crash as the workqueue is destroyed already.

When the workqueue is destroyed it's also flushed, no? So how could
there be still something pending, delayed or not? Did you actually see
this happening?

> Fix this by moving all partner related delayed work to connector instance
> and cancel all of them when ucsi_unregister() is called by PPM client.

I would love to be able to cancel these works, though not because of
driver removal - I'm more concerned about disconnections. The reason
why each task is a unique work is because it allows the driver to add
the same task to the queue as many times as needed, and that was
needed inorder to recover from some firmware issues. If there's only a
dedicated work per task like in your proposal, we can only schedule
the task once at a time, and that may lead into other issues.

But first things first. Is the problem that you are describing here a
real problem that can actually happen?

thanks,

Jack Pham Jan. 5, 2023, 6:34 p.m. UTC | #2

Hi Heikki,

On Thu, Jan 05, 2023 at 01:27:07PM +0200, Heikki Krogerus wrote:
> Hi,
> 
> On Thu, Jan 05, 2023 at 01:42:40PM +0800, Linyu Yuan wrote:
> > When a PPM client unregisters with UCSI framework, connector specific work
> > queue is destroyed. However, a pending delayed work queued before may
> > still exist. Once the delay timer expires and the work is scheduled,
> > this can cause a system crash as the workqueue is destroyed already.
> 
> When the workqueue is destroyed it's also flushed, no? So how could
> there be still something pending, delayed or not? Did you actually see
> this happening?

Yes, we encountered this during a scenario in which our PPM firmware 
is undergoing a reset which is handled by calling ucsi_unregister().
The connectors' workqueues are destroyed but apparently the
destroy_workqueue() does *not* seem to take care pending delayed items.

	Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
	...
	Call trace:
	 __queue_work+0x1f4/0x550
	 delayed_work_timer_fn+0x28/0x38
	 call_timer_fn+0x3c/0x238
	 expire_timers+0xcc/0x168
	 __run_timers+0x194/0x1f8
	 run_timer_softirq+0x2c/0x54
	 _stext+0xec/0x3a8
	 __irq_exit_rcu+0xa0/0xfc
	 irq_exit_rcu+0x18/0x28
	 el0_interrupt+0x5c/0x138
	 __el0_irq_handler_common+0x20/0x30
	 el0t_64_irq_handler+0x18/0x28
	 el0t_64_irq+0x1a0/0x1a4
	Code: eb16013f 54000300 aa1a03e0 9441be2a (f9400280) 

In particular this is happening for the ucsi_check_connection() which is
the currently the only work item using a non-zero delay.  If we look
closely at queue_delayed_work() we can see in that case it defers by
using a separate timer:

	static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
					struct delayed_work *dwork, unsigned long delay)
	{
		struct timer_list *timer = &dwork->timer;
		struct work_struct *work = &dwork->work;

		<...snip...>

		/*
		 * If @delay is 0, queue @dwork->work immediately.  This is for
		 * both optimization and correctness.  The earliest @timer can
		 * expire is on the closest next tick and delayed_work users depend
		 * on that there's no such delay when @delay is 0.
		 */
		if (!delay) {
			__queue_work(cpu, wq, &dwork->work);
			return;
		}

		dwork->wq = wq;
		       ^^^^^^^^ wq gets saved here, but might be
				destroyed before timer expires

		dwork->cpu = cpu;
		timer->expires = jiffies + delay;

		if (unlikely(cpu != WORK_CPU_UNBOUND))
			add_timer_on(timer, cpu);
		else
			add_timer(timer);
	}

In ucsi_unregister() we destroy the connector's wq, but there is a
pending timer still for the ucsi_check_connection() item and upon
expiry it tries to do the real __queue_work() call on a dangling 'wq'.

> > Fix this by moving all partner related delayed work to connector instance
> > and cancel all of them when ucsi_unregister() is called by PPM client.
> 
> I would love to be able to cancel these works, though not because of
> driver removal - I'm more concerned about disconnections. The reason
> why each task is a unique work is because it allows the driver to add
> the same task to the queue as many times as needed, and that was
> needed inorder to recover from some firmware issues. If there's only a
> dedicated work per task like in your proposal, we can only schedule
> the task once at a time, and that may lead into other issues.

I see, queuing a task multiple times is a good reason to allocate the
workers dynamically.  Then what we really need is a way to reliably
cancel & reclaim any pending items that are sitting on their own timers,
since they are otherwise unreachable via the 'wq'. 

cancel_delayed_work(), cancel_delayed_work_sync(), flush_delayed_work()
all require the handle to the delayed_work itself which we don't keep a
reference to.

Do you have any other suggestions for this?

Thanks,
Jack

Heikki Krogerus Jan. 9, 2023, 11:49 a.m. UTC | #3

Hi,

On Thu, Jan 05, 2023 at 10:34:41AM -0800, Jack Pham wrote:
> On Thu, Jan 05, 2023 at 01:27:07PM +0200, Heikki Krogerus wrote:
> > On Thu, Jan 05, 2023 at 01:42:40PM +0800, Linyu Yuan wrote:
> > > When a PPM client unregisters with UCSI framework, connector specific work
> > > queue is destroyed. However, a pending delayed work queued before may
> > > still exist. Once the delay timer expires and the work is scheduled,
> > > this can cause a system crash as the workqueue is destroyed already.
> > 
> > When the workqueue is destroyed it's also flushed, no? So how could
> > there be still something pending, delayed or not? Did you actually see
> > this happening?
> 
> Yes, we encountered this during a scenario in which our PPM firmware 
> is undergoing a reset which is handled by calling ucsi_unregister().
> The connectors' workqueues are destroyed but apparently the
> destroy_workqueue() does *not* seem to take care pending delayed items.
> 
> 	Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> 	...
> 	Call trace:
> 	 __queue_work+0x1f4/0x550
> 	 delayed_work_timer_fn+0x28/0x38
> 	 call_timer_fn+0x3c/0x238
> 	 expire_timers+0xcc/0x168
> 	 __run_timers+0x194/0x1f8
> 	 run_timer_softirq+0x2c/0x54
> 	 _stext+0xec/0x3a8
> 	 __irq_exit_rcu+0xa0/0xfc
> 	 irq_exit_rcu+0x18/0x28
> 	 el0_interrupt+0x5c/0x138
> 	 __el0_irq_handler_common+0x20/0x30
> 	 el0t_64_irq_handler+0x18/0x28
> 	 el0t_64_irq+0x1a0/0x1a4
> 	Code: eb16013f 54000300 aa1a03e0 9441be2a (f9400280) 
> 
> In particular this is happening for the ucsi_check_connection() which is
> the currently the only work item using a non-zero delay.  If we look
> closely at queue_delayed_work() we can see in that case it defers by
> using a separate timer:
> 
> 	static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
> 					struct delayed_work *dwork, unsigned long delay)
> 	{
> 		struct timer_list *timer = &dwork->timer;
> 		struct work_struct *work = &dwork->work;
> 
> 		<...snip...>
> 
> 		/*
> 		 * If @delay is 0, queue @dwork->work immediately.  This is for
> 		 * both optimization and correctness.  The earliest @timer can
> 		 * expire is on the closest next tick and delayed_work users depend
> 		 * on that there's no such delay when @delay is 0.
> 		 */
> 		if (!delay) {
> 			__queue_work(cpu, wq, &dwork->work);
> 			return;
> 		}
> 
> 		dwork->wq = wq;
> 		       ^^^^^^^^ wq gets saved here, but might be
> 				destroyed before timer expires
> 
> 		dwork->cpu = cpu;
> 		timer->expires = jiffies + delay;
> 
> 		if (unlikely(cpu != WORK_CPU_UNBOUND))
> 			add_timer_on(timer, cpu);
> 		else
> 			add_timer(timer);
> 	}
> 
> In ucsi_unregister() we destroy the connector's wq, but there is a
> pending timer still for the ucsi_check_connection() item and upon
> expiry it tries to do the real __queue_work() call on a dangling 'wq'.

Okay.

> > > Fix this by moving all partner related delayed work to connector instance
> > > and cancel all of them when ucsi_unregister() is called by PPM client.
> > 
> > I would love to be able to cancel these works, though not because of
> > driver removal - I'm more concerned about disconnections. The reason
> > why each task is a unique work is because it allows the driver to add
> > the same task to the queue as many times as needed, and that was
> > needed inorder to recover from some firmware issues. If there's only a
> > dedicated work per task like in your proposal, we can only schedule
> > the task once at a time, and that may lead into other issues.
> 
> I see, queuing a task multiple times is a good reason to allocate the
> workers dynamically.  Then what we really need is a way to reliably
> cancel & reclaim any pending items that are sitting on their own timers,
> since they are otherwise unreachable via the 'wq'. 
> 
> cancel_delayed_work(), cancel_delayed_work_sync(), flush_delayed_work()
> all require the handle to the delayed_work itself which we don't keep a
> reference to.
> 
> Do you have any other suggestions for this?

How about separate list for the works?

diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 8695eb2c6227e..d5cf7573a2cfa 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -187,6 +187,7 @@ EXPORT_SYMBOL_GPL(ucsi_send_command);
 
 struct ucsi_work {
        struct delayed_work work;
+       struct list_head node;
        unsigned long delay;
        unsigned int count;
        struct ucsi_connector *con;
@@ -202,6 +203,7 @@ static void ucsi_poll_worker(struct work_struct *work)
        mutex_lock(&con->lock);
 
        if (!con->partner) {
+               list_del(&uwork->node);
                mutex_unlock(&con->lock);
                kfree(uwork);
                return;
@@ -209,10 +211,12 @@ static void ucsi_poll_worker(struct work_struct *work)
 
        ret = uwork->cb(con);
 
-       if (uwork->count-- && (ret == -EBUSY || ret == -ETIMEDOUT))
+       if (uwork->count-- && (ret == -EBUSY || ret == -ETIMEDOUT)) {
                queue_delayed_work(con->wq, &uwork->work, uwork->delay);
-       else
+       } else {
+               list_del(&uwork->node);
                kfree(uwork);
+       }
 
        mutex_unlock(&con->lock);
 }
@@ -236,6 +240,7 @@ static int ucsi_partner_task(struct ucsi_connector *con,
        uwork->con = con;
        uwork->cb = cb;
 
+       list_add_tail(&uwork->node, &con->works);
        queue_delayed_work(con->wq, &uwork->work, delay);
 
        return 0;
@@ -1056,6 +1061,7 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
        INIT_WORK(&con->work, ucsi_handle_connector_change);
        init_completion(&con->complete);
        mutex_init(&con->lock);
+       INIT_LIST_HEAD(&con->works);
        con->num = index + 1;
        con->ucsi = ucsi;
 
diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h
index 8361e1cfc8eba..dcb792eeedb94 100644
--- a/drivers/usb/typec/ucsi/ucsi.h
+++ b/drivers/usb/typec/ucsi/ucsi.h
@@ -325,6 +325,7 @@ struct ucsi_connector {
        struct work_struct work;
        struct completion complete;
        struct workqueue_struct *wq;
+       struct list_head works;
 
        struct typec_port *port;
        struct typec_partner *partner;


Something like that. Then just walk through the list and cancel each
work in it before destroying the wq. Would that work?

thanks,

Jack Pham Jan. 9, 2023, 8:46 p.m. UTC | #4

On Mon, Jan 09, 2023 at 01:49:22PM +0200, Heikki Krogerus wrote:
> Hi,
> 
> On Thu, Jan 05, 2023 at 10:34:41AM -0800, Jack Pham wrote:
> > On Thu, Jan 05, 2023 at 01:27:07PM +0200, Heikki Krogerus wrote:
> > > On Thu, Jan 05, 2023 at 01:42:40PM +0800, Linyu Yuan wrote:
> > > > When a PPM client unregisters with UCSI framework, connector specific work
> > > > queue is destroyed. However, a pending delayed work queued before may
> > > > still exist. Once the delay timer expires and the work is scheduled,
> > > > this can cause a system crash as the workqueue is destroyed already.
> > > 
> > > When the workqueue is destroyed it's also flushed, no? So how could
> > > there be still something pending, delayed or not? Did you actually see
> > > this happening?
> > 
> > Yes, we encountered this during a scenario in which our PPM firmware 
> > is undergoing a reset which is handled by calling ucsi_unregister().
> > The connectors' workqueues are destroyed but apparently the
> > destroy_workqueue() does *not* seem to take care pending delayed items.
> > 
> > 	Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> > 	...
> > 	Call trace:
> > 	 __queue_work+0x1f4/0x550
> > 	 delayed_work_timer_fn+0x28/0x38
> > 	 call_timer_fn+0x3c/0x238
> > 	 expire_timers+0xcc/0x168
> > 	 __run_timers+0x194/0x1f8
> > 	 run_timer_softirq+0x2c/0x54
> > 	 _stext+0xec/0x3a8
> > 	 __irq_exit_rcu+0xa0/0xfc
> > 	 irq_exit_rcu+0x18/0x28
> > 	 el0_interrupt+0x5c/0x138
> > 	 __el0_irq_handler_common+0x20/0x30
> > 	 el0t_64_irq_handler+0x18/0x28
> > 	 el0t_64_irq+0x1a0/0x1a4
> > 	Code: eb16013f 54000300 aa1a03e0 9441be2a (f9400280) 
> > 
> > In particular this is happening for the ucsi_check_connection() which is
> > the currently the only work item using a non-zero delay.  If we look
> > closely at queue_delayed_work() we can see in that case it defers by
> > using a separate timer:
> > 
> > 	static void __queue_delayed_work(int cpu, struct workqueue_struct *wq,
> > 					struct delayed_work *dwork, unsigned long delay)
> > 	{
> > 		struct timer_list *timer = &dwork->timer;
> > 		struct work_struct *work = &dwork->work;
> > 
> > 		<...snip...>
> > 
> > 		/*
> > 		 * If @delay is 0, queue @dwork->work immediately.  This is for
> > 		 * both optimization and correctness.  The earliest @timer can
> > 		 * expire is on the closest next tick and delayed_work users depend
> > 		 * on that there's no such delay when @delay is 0.
> > 		 */
> > 		if (!delay) {
> > 			__queue_work(cpu, wq, &dwork->work);
> > 			return;
> > 		}
> > 
> > 		dwork->wq = wq;
> > 		       ^^^^^^^^ wq gets saved here, but might be
> > 				destroyed before timer expires
> > 
> > 		dwork->cpu = cpu;
> > 		timer->expires = jiffies + delay;
> > 
> > 		if (unlikely(cpu != WORK_CPU_UNBOUND))
> > 			add_timer_on(timer, cpu);
> > 		else
> > 			add_timer(timer);
> > 	}
> > 
> > In ucsi_unregister() we destroy the connector's wq, but there is a
> > pending timer still for the ucsi_check_connection() item and upon
> > expiry it tries to do the real __queue_work() call on a dangling 'wq'.
> 
> Okay.
> 
> > > > Fix this by moving all partner related delayed work to connector instance
> > > > and cancel all of them when ucsi_unregister() is called by PPM client.
> > > 
> > > I would love to be able to cancel these works, though not because of
> > > driver removal - I'm more concerned about disconnections. The reason
> > > why each task is a unique work is because it allows the driver to add
> > > the same task to the queue as many times as needed, and that was
> > > needed inorder to recover from some firmware issues. If there's only a
> > > dedicated work per task like in your proposal, we can only schedule
> > > the task once at a time, and that may lead into other issues.
> > 
> > I see, queuing a task multiple times is a good reason to allocate the
> > workers dynamically.  Then what we really need is a way to reliably
> > cancel & reclaim any pending items that are sitting on their own timers,
> > since they are otherwise unreachable via the 'wq'. 
> > 
> > cancel_delayed_work(), cancel_delayed_work_sync(), flush_delayed_work()
> > all require the handle to the delayed_work itself which we don't keep a
> > reference to.
> > 
> > Do you have any other suggestions for this?
> 
> How about separate list for the works?
> 
> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> index 8695eb2c6227e..d5cf7573a2cfa 100644
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -187,6 +187,7 @@ EXPORT_SYMBOL_GPL(ucsi_send_command);
>  
>  struct ucsi_work {
>         struct delayed_work work;
> +       struct list_head node;
>         unsigned long delay;
>         unsigned int count;
>         struct ucsi_connector *con;
> @@ -202,6 +203,7 @@ static void ucsi_poll_worker(struct work_struct *work)
>         mutex_lock(&con->lock);
>  
>         if (!con->partner) {
> +               list_del(&uwork->node);
>                 mutex_unlock(&con->lock);
>                 kfree(uwork);
>                 return;
> @@ -209,10 +211,12 @@ static void ucsi_poll_worker(struct work_struct *work)
>  
>         ret = uwork->cb(con);
>  
> -       if (uwork->count-- && (ret == -EBUSY || ret == -ETIMEDOUT))
> +       if (uwork->count-- && (ret == -EBUSY || ret == -ETIMEDOUT)) {
>                 queue_delayed_work(con->wq, &uwork->work, uwork->delay);
> -       else
> +       } else {
> +               list_del(&uwork->node);
>                 kfree(uwork);
> +       }
>  
>         mutex_unlock(&con->lock);
>  }
> @@ -236,6 +240,7 @@ static int ucsi_partner_task(struct ucsi_connector *con,
>         uwork->con = con;
>         uwork->cb = cb;
>  
> +       list_add_tail(&uwork->node, &con->works);
>         queue_delayed_work(con->wq, &uwork->work, delay);
>  
>         return 0;
> @@ -1056,6 +1061,7 @@ static int ucsi_register_port(struct ucsi *ucsi, int index)
>         INIT_WORK(&con->work, ucsi_handle_connector_change);
>         init_completion(&con->complete);
>         mutex_init(&con->lock);
> +       INIT_LIST_HEAD(&con->works);
>         con->num = index + 1;
>         con->ucsi = ucsi;
>  
> diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h
> index 8361e1cfc8eba..dcb792eeedb94 100644
> --- a/drivers/usb/typec/ucsi/ucsi.h
> +++ b/drivers/usb/typec/ucsi/ucsi.h
> @@ -325,6 +325,7 @@ struct ucsi_connector {
>         struct work_struct work;
>         struct completion complete;
>         struct workqueue_struct *wq;
> +       struct list_head works;
>  
>         struct typec_port *port;
>         struct typec_partner *partner;
> 
> 
> Something like that. Then just walk through the list and cancel each
> work in it before destroying the wq. Would that work?

Thanks Heikki for the suggestion!  I think it should work (plus the
actual list walk in ucsi_unregister).  We will give it a try and will
send a v2 if it works out.

Jack

diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index eabe519..f6b23e9 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -185,14 +185,6 @@  EXPORT_SYMBOL_GPL(ucsi_send_command);
 
 /* -------------------------------------------------------------------------- */
 
-struct ucsi_work {
-	struct delayed_work work;
-	unsigned long delay;
-	unsigned int count;
-	struct ucsi_connector *con;
-	int (*cb)(struct ucsi_connector *);
-};
-
 static void ucsi_poll_worker(struct work_struct *work)
 {
 	struct ucsi_work *uwork = container_of(work, struct ucsi_work, work.work);
@@ -203,7 +195,6 @@  static void ucsi_poll_worker(struct work_struct *work)
 
 	if (!con->partner) {
 		mutex_unlock(&con->lock);
-		kfree(uwork);
 		return;
 	}
 
@@ -211,30 +202,20 @@  static void ucsi_poll_worker(struct work_struct *work)
 
 	if (uwork->count-- && (ret == -EBUSY || ret == -ETIMEDOUT))
 		queue_delayed_work(con->wq, &uwork->work, uwork->delay);
-	else
-		kfree(uwork);
 
 	mutex_unlock(&con->lock);
 }
 
-static int ucsi_partner_task(struct ucsi_connector *con,
-			     int (*cb)(struct ucsi_connector *),
+static int ucsi_partner_task(struct ucsi_work *uwork,
 			     int retries, unsigned long delay)
 {
-	struct ucsi_work *uwork;
+	struct ucsi_connector *con = uwork->con;
 
 	if (!con->partner)
 		return 0;
 
-	uwork = kzalloc(sizeof(*uwork), GFP_KERNEL);
-	if (!uwork)
-		return -ENOMEM;
-
-	INIT_DELAYED_WORK(&uwork->work, ucsi_poll_worker);
 	uwork->count = retries;
 	uwork->delay = delay;
-	uwork->con = con;
-	uwork->cb = cb;
 
 	queue_delayed_work(con->wq, &uwork->work, delay);
 
@@ -636,8 +617,8 @@  static void ucsi_pwr_opmode_change(struct ucsi_connector *con)
 	case UCSI_CONSTAT_PWR_OPMODE_PD:
 		con->rdo = con->status.request_data_obj;
 		typec_set_pwr_opmode(con->port, TYPEC_PWR_MODE_PD);
-		ucsi_partner_task(con, ucsi_get_src_pdos, 30, 0);
-		ucsi_partner_task(con, ucsi_check_altmodes, 30, 0);
+		ucsi_partner_task(&con->pdos_work, 30, 0);
+		ucsi_partner_task(&con->altmodes_work, 30, 0);
 		break;
 	case UCSI_CONSTAT_PWR_OPMODE_TYPEC1_5:
 		con->rdo = 0;
@@ -799,7 +780,7 @@  static void ucsi_handle_connector_change(struct work_struct *work)
 
 		if (con->status.flags & UCSI_CONSTAT_CONNECTED) {
 			ucsi_register_partner(con);
-			ucsi_partner_task(con, ucsi_check_connection, 1, HZ);
+			ucsi_partner_task(&con->connection_work, 1, HZ);
 		} else {
 			ucsi_unregister_partner(con);
 		}
@@ -818,7 +799,7 @@  static void ucsi_handle_connector_change(struct work_struct *work)
 	}
 
 	if (con->status.change & UCSI_CONSTAT_CAM_CHANGE)
-		ucsi_partner_task(con, ucsi_check_altmodes, 1, 0);
+		ucsi_partner_task(&con->altmodes_work, 1, 0);
 
 	clear_bit(EVENT_PENDING, &con->ucsi->flags);
 
@@ -1034,6 +1015,21 @@  static struct fwnode_handle *ucsi_find_fwnode(struct ucsi_connector *con)
 	return NULL;
 }
 
+static void ucsi_partner_task_init(struct ucsi_connector *con)
+{
+	INIT_DELAYED_WORK(&con->connection_work.work, ucsi_poll_worker);
+	con->connection_work.cb = ucsi_check_connection;
+	con->connection_work.con = con;
+
+	INIT_DELAYED_WORK(&con->pdos_work.work, ucsi_poll_worker);
+	con->pdos_work.cb = ucsi_get_src_pdos;
+	con->pdos_work.con = con;
+
+	INIT_DELAYED_WORK(&con->altmodes_work.work, ucsi_poll_worker);
+	con->altmodes_work.cb = ucsi_check_altmodes;
+	con->altmodes_work.con = con;
+}
+
 static int ucsi_register_port(struct ucsi *ucsi, int index)
 {
 	struct ucsi_connector *con = &ucsi->connector[index];
@@ -1053,6 +1049,7 @@  static int ucsi_register_port(struct ucsi *ucsi, int index)
 	if (!con->wq)
 		return -ENOMEM;
 
+	ucsi_partner_task_init(con);
 	INIT_WORK(&con->work, ucsi_handle_connector_change);
 	init_completion(&con->complete);
 	mutex_init(&con->lock);
@@ -1287,7 +1284,7 @@  static void ucsi_resume_work(struct work_struct *work)
 
 	for (con = ucsi->connector; con->port; con++) {
 		mutex_lock(&con->lock);
-		ucsi_partner_task(con, ucsi_check_connection, 1, 0);
+		ucsi_partner_task(&con->connection_work, 1, 0);
 		mutex_unlock(&con->lock);
 	}
 }
@@ -1396,6 +1393,17 @@  int ucsi_register(struct ucsi *ucsi)
 }
 EXPORT_SYMBOL_GPL(ucsi_register);
 
+static void ucsi_partner_task_destroy(struct ucsi_connector *con)
+{
+	mutex_lock(&con->lock);
+
+	cancel_delayed_work_sync(&con->connection_work.work);
+	cancel_delayed_work_sync(&con->pdos_work.work);
+	cancel_delayed_work_sync(&con->altmodes_work.work);
+
+	mutex_unlock(&con->lock);
+}
+
 /**
  * ucsi_unregister - Unregister UCSI interface
  * @ucsi: UCSI interface to be unregistered
@@ -1420,6 +1428,7 @@  void ucsi_unregister(struct ucsi *ucsi)
 		ucsi_unregister_altmodes(&ucsi->connector[i],
 					 UCSI_RECIPIENT_CON);
 		ucsi_unregister_port_psy(&ucsi->connector[i]);
+		ucsi_partner_task_destroy(&ucsi->connector[i]);
 		if (ucsi->connector[i].wq)
 			destroy_workqueue(ucsi->connector[i].wq);
 		typec_unregister_port(ucsi->connector[i].port);
diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h
index c968474..40d39d2 100644
--- a/drivers/usb/typec/ucsi/ucsi.h
+++ b/drivers/usb/typec/ucsi/ucsi.h
@@ -314,6 +314,14 @@  struct ucsi {
 #define UCSI_TYPEC_1_5_CURRENT	1500
 #define UCSI_TYPEC_3_0_CURRENT	3000
 
+struct ucsi_work {
+	struct delayed_work work;
+	unsigned long delay;
+	unsigned int count;
+	struct ucsi_connector *con;
+	int (*cb)(struct ucsi_connector *con);
+};
+
 struct ucsi_connector {
 	int num;
 
@@ -322,6 +330,9 @@  struct ucsi_connector {
 	struct work_struct work;
 	struct completion complete;
 	struct workqueue_struct *wq;
+	struct ucsi_work connection_work;
+	struct ucsi_work pdos_work;
+	struct ucsi_work altmodes_work;
 
 	struct typec_port *port;
 	struct typec_partner *partner;

[v2] usb: ucsi: fix connector partner ucsi work issue

Commit Message

Comments

Patch