Message ID | 20240121204123.275441-2-lk@c--e.de (mailing list archive) |
---|---|
State | Accepted |
Commit | c9aed03a0a683fd1600ea92f2ad32232d4736272 |
Headers | show |
Series | UCSI fixes | expand |
On Sun, Jan 21, 2024 at 09:41:21PM +0100, Christian A. Ehrhardt wrote: > Calling ->sync_write must be done while holding the PPM lock as > the mailbox logic does not support concurrent commands. > > At least since the addition of partner task this means that > ucsi_acknowledge_connector_change should be called with the > PPM lock held as it calls ->sync_write. > > Thus protect the only call to ucsi_acknowledge_connector_change > with the PPM. All other calls to ->sync_write already happen > under the PPM lock. > > Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking") > Cc: stable@vger.kernel.org > Signed-off-by: Christian A. Ehrhardt <lk@c--e.de> > --- > drivers/usb/typec/ucsi/ucsi.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c > index 61b64558f96c..8f9dff993b3d 100644 > --- a/drivers/usb/typec/ucsi/ucsi.c > +++ b/drivers/usb/typec/ucsi/ucsi.c > @@ -935,7 +935,9 @@ static void ucsi_handle_connector_change(struct work_struct *work) > > clear_bit(EVENT_PENDING, &con->ucsi->flags); > > + mutex_lock(&ucsi->ppm_lock); > ret = ucsi_acknowledge_connector_change(ucsi); > + mutex_unlock(&ucsi->ppm_lock); > if (ret) > dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret); Is this really actually fixing some issue? The function has already taken the connector lock, so what exactly could happen without this? thanks,
Hi Heikki, Thanks for looking into this. On Wed, Jan 24, 2024 at 10:09:04AM +0200, Heikki Krogerus wrote: > On Sun, Jan 21, 2024 at 09:41:21PM +0100, Christian A. Ehrhardt wrote: > > Calling ->sync_write must be done while holding the PPM lock as > > the mailbox logic does not support concurrent commands. > > > > At least since the addition of partner task this means that > > ucsi_acknowledge_connector_change should be called with the > > PPM lock held as it calls ->sync_write. > > > > Thus protect the only call to ucsi_acknowledge_connector_change > > with the PPM. All other calls to ->sync_write already happen > > under the PPM lock. > > > > Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking") > > Cc: stable@vger.kernel.org > > Signed-off-by: Christian A. Ehrhardt <lk@c--e.de> > > --- > > drivers/usb/typec/ucsi/ucsi.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c > > index 61b64558f96c..8f9dff993b3d 100644 > > --- a/drivers/usb/typec/ucsi/ucsi.c > > +++ b/drivers/usb/typec/ucsi/ucsi.c > > @@ -935,7 +935,9 @@ static void ucsi_handle_connector_change(struct work_struct *work) > > > > clear_bit(EVENT_PENDING, &con->ucsi->flags); > > > > + mutex_lock(&ucsi->ppm_lock); > > ret = ucsi_acknowledge_connector_change(ucsi); > > + mutex_unlock(&ucsi->ppm_lock); > > if (ret) > > dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret); > > Is this really actually fixing some issue? The function has already > taken the connector lock, so what exactly could happen without this? I've definitely _seen_ issues with the quirk from PATCH 3/3 if the lock is missing. I'm pretty sure from looking at the code that races with other connectors can happen without the quirk, too. The PPM lock protects the Mailbox and the ACK_PENDING/COMMAND_PENDING bits and I could observe cases where ucsi_acpi_sync_write() was entered with the COMMAND_PENDING bit already set. One possible sequence is this: ucsi_connector_change() for connector #1 => schedules partner tasks => Acks the connector change => Releases locks ucsi_connecotr_change() for connector #2 => acquire con lock for #2 => Start to process connector state change => Processing reaches the point right before ucsi_acknowledge_connector_change() Connector #1 workqueue starts to process one of the partner tasks => Acquire con lock for connector #1 => Acquire ppm lock => Enter ucsi_exec_command() => ->sync_write() starts to use the mailbox and sets COMMAND_PENDING => ->sync_write blocks waiting for the command completion with wait_for_completion_timeout(). ucsi_connector_change() for connector #2 continues => execute ucsi_acknowledge_connector_change() and start using the mailbox that is still in use. => BOOM There is a simpler an much more likely sequence with the dell quirk active. regards Christian
On Wed, Jan 24, 2024 at 01:04:06PM +0100, Christian A. Ehrhardt wrote: > > Hi Heikki, > > Thanks for looking into this. > > On Wed, Jan 24, 2024 at 10:09:04AM +0200, Heikki Krogerus wrote: > > On Sun, Jan 21, 2024 at 09:41:21PM +0100, Christian A. Ehrhardt wrote: > > > Calling ->sync_write must be done while holding the PPM lock as > > > the mailbox logic does not support concurrent commands. > > > > > > At least since the addition of partner task this means that > > > ucsi_acknowledge_connector_change should be called with the > > > PPM lock held as it calls ->sync_write. > > > > > > Thus protect the only call to ucsi_acknowledge_connector_change > > > with the PPM. All other calls to ->sync_write already happen > > > under the PPM lock. > > > > > > Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking") > > > Cc: stable@vger.kernel.org > > > Signed-off-by: Christian A. Ehrhardt <lk@c--e.de> > > > --- > > > drivers/usb/typec/ucsi/ucsi.c | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c > > > index 61b64558f96c..8f9dff993b3d 100644 > > > --- a/drivers/usb/typec/ucsi/ucsi.c > > > +++ b/drivers/usb/typec/ucsi/ucsi.c > > > @@ -935,7 +935,9 @@ static void ucsi_handle_connector_change(struct work_struct *work) > > > > > > clear_bit(EVENT_PENDING, &con->ucsi->flags); > > > > > > + mutex_lock(&ucsi->ppm_lock); > > > ret = ucsi_acknowledge_connector_change(ucsi); > > > + mutex_unlock(&ucsi->ppm_lock); > > > if (ret) > > > dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret); > > > > Is this really actually fixing some issue? The function has already > > taken the connector lock, so what exactly could happen without this? > > I've definitely _seen_ issues with the quirk from PATCH 3/3 if the > lock is missing. I'm pretty sure from looking at the code that races > with other connectors can happen without the quirk, too. > > The PPM lock protects the Mailbox and the ACK_PENDING/COMMAND_PENDING > bits and I could observe cases where ucsi_acpi_sync_write() was entered > with the COMMAND_PENDING bit already set. One possible sequence is this: > > ucsi_connector_change() for connector #1 > => schedules partner tasks > => Acks the connector change > => Releases locks > ucsi_connecotr_change() for connector #2 > => acquire con lock for #2 > => Start to process connector state change > => Processing reaches the point right before > ucsi_acknowledge_connector_change() > Connector #1 workqueue starts to process one of the partner tasks > => Acquire con lock for connector #1 > => Acquire ppm lock > => Enter ucsi_exec_command() > => ->sync_write() starts to use the mailbox and sets > COMMAND_PENDING > => ->sync_write blocks waiting for the command completion > with wait_for_completion_timeout(). > ucsi_connector_change() for connector #2 continues > => execute ucsi_acknowledge_connector_change() and start using > the mailbox that is still in use. > => BOOM > > There is a simpler an much more likely sequence with the dell quirk active. Okay. Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index 61b64558f96c..8f9dff993b3d 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -935,7 +935,9 @@ static void ucsi_handle_connector_change(struct work_struct *work) clear_bit(EVENT_PENDING, &con->ucsi->flags); + mutex_lock(&ucsi->ppm_lock); ret = ucsi_acknowledge_connector_change(ucsi); + mutex_unlock(&ucsi->ppm_lock); if (ret) dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret);
Calling ->sync_write must be done while holding the PPM lock as the mailbox logic does not support concurrent commands. At least since the addition of partner task this means that ucsi_acknowledge_connector_change should be called with the PPM lock held as it calls ->sync_write. Thus protect the only call to ucsi_acknowledge_connector_change with the PPM. All other calls to ->sync_write already happen under the PPM lock. Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking") Cc: stable@vger.kernel.org Signed-off-by: Christian A. Ehrhardt <lk@c--e.de> --- drivers/usb/typec/ucsi/ucsi.c | 2 ++ 1 file changed, 2 insertions(+)