diff mbox series

[v3,1/3] usb: ucsi: Add missing ppm_lock

Message ID 20240121204123.275441-2-lk@c--e.de (mailing list archive)
State Accepted
Commit c9aed03a0a683fd1600ea92f2ad32232d4736272
Headers show
Series UCSI fixes | expand

Commit Message

Christian A. Ehrhardt Jan. 21, 2024, 8:41 p.m. UTC
Calling ->sync_write must be done while holding the PPM lock as
the mailbox logic does not support concurrent commands.

At least since the addition of partner task this means that
ucsi_acknowledge_connector_change should be called with the
PPM lock held as it calls ->sync_write.

Thus protect the only call to ucsi_acknowledge_connector_change
with the PPM. All other calls to ->sync_write already happen
under the PPM lock.

Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking")
Cc: stable@vger.kernel.org
Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
---
 drivers/usb/typec/ucsi/ucsi.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Heikki Krogerus Jan. 24, 2024, 8:09 a.m. UTC | #1
On Sun, Jan 21, 2024 at 09:41:21PM +0100, Christian A. Ehrhardt wrote:
> Calling ->sync_write must be done while holding the PPM lock as
> the mailbox logic does not support concurrent commands.
> 
> At least since the addition of partner task this means that
> ucsi_acknowledge_connector_change should be called with the
> PPM lock held as it calls ->sync_write.
> 
> Thus protect the only call to ucsi_acknowledge_connector_change
> with the PPM. All other calls to ->sync_write already happen
> under the PPM lock.
> 
> Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
> ---
>  drivers/usb/typec/ucsi/ucsi.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> index 61b64558f96c..8f9dff993b3d 100644
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -935,7 +935,9 @@ static void ucsi_handle_connector_change(struct work_struct *work)
>  
>  	clear_bit(EVENT_PENDING, &con->ucsi->flags);
>  
> +	mutex_lock(&ucsi->ppm_lock);
>  	ret = ucsi_acknowledge_connector_change(ucsi);
> +	mutex_unlock(&ucsi->ppm_lock);
>  	if (ret)
>  		dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret);

Is this really actually fixing some issue? The function has already
taken the connector lock, so what exactly could happen without this?

thanks,
Christian A. Ehrhardt Jan. 24, 2024, 12:04 p.m. UTC | #2
Hi Heikki,

Thanks for looking into this.

On Wed, Jan 24, 2024 at 10:09:04AM +0200, Heikki Krogerus wrote:
> On Sun, Jan 21, 2024 at 09:41:21PM +0100, Christian A. Ehrhardt wrote:
> > Calling ->sync_write must be done while holding the PPM lock as
> > the mailbox logic does not support concurrent commands.
> > 
> > At least since the addition of partner task this means that
> > ucsi_acknowledge_connector_change should be called with the
> > PPM lock held as it calls ->sync_write.
> > 
> > Thus protect the only call to ucsi_acknowledge_connector_change
> > with the PPM. All other calls to ->sync_write already happen
> > under the PPM lock.
> > 
> > Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
> > ---
> >  drivers/usb/typec/ucsi/ucsi.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> > index 61b64558f96c..8f9dff993b3d 100644
> > --- a/drivers/usb/typec/ucsi/ucsi.c
> > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > @@ -935,7 +935,9 @@ static void ucsi_handle_connector_change(struct work_struct *work)
> >  
> >  	clear_bit(EVENT_PENDING, &con->ucsi->flags);
> >  
> > +	mutex_lock(&ucsi->ppm_lock);
> >  	ret = ucsi_acknowledge_connector_change(ucsi);
> > +	mutex_unlock(&ucsi->ppm_lock);
> >  	if (ret)
> >  		dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret);
> 
> Is this really actually fixing some issue? The function has already
> taken the connector lock, so what exactly could happen without this?

I've definitely _seen_ issues with the quirk from PATCH 3/3 if the
lock is missing. I'm pretty sure from looking at the code that races
with other connectors can happen without the quirk, too.

The PPM lock protects the Mailbox and the ACK_PENDING/COMMAND_PENDING
bits and I could observe cases where ucsi_acpi_sync_write() was entered
with the COMMAND_PENDING bit already set. One possible sequence is this:

ucsi_connector_change() for connector #1
	=> schedules partner tasks
	=> Acks the connector change
	=> Releases locks
ucsi_connecotr_change() for connector #2
	=> acquire con lock for #2
	=> Start to process connector state change
	=> Processing reaches the point right before 
	   ucsi_acknowledge_connector_change()
Connector #1 workqueue starts to process one of the partner tasks
	=> Acquire con lock for connector #1
	=> Acquire ppm lock
	=> Enter ucsi_exec_command()
	=> ->sync_write() starts to use the mailbox and sets
	   COMMAND_PENDING
	=> ->sync_write blocks waiting for the command completion
	   with wait_for_completion_timeout().
ucsi_connector_change() for connector #2 continues
	=> execute ucsi_acknowledge_connector_change() and start using
	   the mailbox that is still in use.
	=> BOOM

There is a simpler an much more likely sequence with the dell quirk active.

     regards  Christian
Heikki Krogerus Jan. 24, 2024, 1:08 p.m. UTC | #3
On Wed, Jan 24, 2024 at 01:04:06PM +0100, Christian A. Ehrhardt wrote:
> 
> Hi Heikki,
> 
> Thanks for looking into this.
> 
> On Wed, Jan 24, 2024 at 10:09:04AM +0200, Heikki Krogerus wrote:
> > On Sun, Jan 21, 2024 at 09:41:21PM +0100, Christian A. Ehrhardt wrote:
> > > Calling ->sync_write must be done while holding the PPM lock as
> > > the mailbox logic does not support concurrent commands.
> > > 
> > > At least since the addition of partner task this means that
> > > ucsi_acknowledge_connector_change should be called with the
> > > PPM lock held as it calls ->sync_write.
> > > 
> > > Thus protect the only call to ucsi_acknowledge_connector_change
> > > with the PPM. All other calls to ->sync_write already happen
> > > under the PPM lock.
> > > 
> > > Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
> > > ---
> > >  drivers/usb/typec/ucsi/ucsi.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> > > index 61b64558f96c..8f9dff993b3d 100644
> > > --- a/drivers/usb/typec/ucsi/ucsi.c
> > > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > > @@ -935,7 +935,9 @@ static void ucsi_handle_connector_change(struct work_struct *work)
> > >  
> > >  	clear_bit(EVENT_PENDING, &con->ucsi->flags);
> > >  
> > > +	mutex_lock(&ucsi->ppm_lock);
> > >  	ret = ucsi_acknowledge_connector_change(ucsi);
> > > +	mutex_unlock(&ucsi->ppm_lock);
> > >  	if (ret)
> > >  		dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret);
> > 
> > Is this really actually fixing some issue? The function has already
> > taken the connector lock, so what exactly could happen without this?
> 
> I've definitely _seen_ issues with the quirk from PATCH 3/3 if the
> lock is missing. I'm pretty sure from looking at the code that races
> with other connectors can happen without the quirk, too.
> 
> The PPM lock protects the Mailbox and the ACK_PENDING/COMMAND_PENDING
> bits and I could observe cases where ucsi_acpi_sync_write() was entered
> with the COMMAND_PENDING bit already set. One possible sequence is this:
> 
> ucsi_connector_change() for connector #1
> 	=> schedules partner tasks
> 	=> Acks the connector change
> 	=> Releases locks
> ucsi_connecotr_change() for connector #2
> 	=> acquire con lock for #2
> 	=> Start to process connector state change
> 	=> Processing reaches the point right before 
> 	   ucsi_acknowledge_connector_change()
> Connector #1 workqueue starts to process one of the partner tasks
> 	=> Acquire con lock for connector #1
> 	=> Acquire ppm lock
> 	=> Enter ucsi_exec_command()
> 	=> ->sync_write() starts to use the mailbox and sets
> 	   COMMAND_PENDING
> 	=> ->sync_write blocks waiting for the command completion
> 	   with wait_for_completion_timeout().
> ucsi_connector_change() for connector #2 continues
> 	=> execute ucsi_acknowledge_connector_change() and start using
> 	   the mailbox that is still in use.
> 	=> BOOM
> 
> There is a simpler an much more likely sequence with the dell quirk active.

Okay.

Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
diff mbox series

Patch

diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 61b64558f96c..8f9dff993b3d 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -935,7 +935,9 @@  static void ucsi_handle_connector_change(struct work_struct *work)
 
 	clear_bit(EVENT_PENDING, &con->ucsi->flags);
 
+	mutex_lock(&ucsi->ppm_lock);
 	ret = ucsi_acknowledge_connector_change(ucsi);
+	mutex_unlock(&ucsi->ppm_lock);
 	if (ret)
 		dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret);