mbox series

[v2,0/3] soc: qcom: pmic_glink: v6.11-rc bug fixes

Message ID 20240819-pmic-glink-v6-11-races-v2-0-88fe3ab1f0e2@quicinc.com (mailing list archive)
Headers show
Series soc: qcom: pmic_glink: v6.11-rc bug fixes | expand

Message

Bjorn Andersson Aug. 19, 2024, 8:07 p.m. UTC
Amit and Johan both reported a NULL pointer dereference in the
pmic_glink client code during initialization, and Stephen Boyd pointed
out the problem (race condition).

While investigating, and writing the fix, I noticed that
ucsi_unregister() is called in atomic context but tries to sleep, and I
also noticed that the condition for when to inform the pmic_glink client
drivers when the remote has gone down is just wrong.

So, let's fix all three.

As mentioned in the commit message for the UCSI fix, I have a series in
the works that makes the GLINK callback happen in a sleepable context,
which would remove the need for the clients list to be protected by a
spinlock, and removing the work scheduling. This is however not -rc
material...

In addition to the NULL pointer dereference, there is the -ECANCELED
issue reported here:
https://lore.kernel.org/all/Zqet8iInnDhnxkT9@hovoldconsulting.com/
Johan reports that these fixes do not address that issue.

Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
---
Changes in v2:
- Refer to the correct commit in the ucsi_unregister() patch.
- Updated wording in the same commit message about the new error message
  in the log.
- Changed the data type of the introduced state variables, opted to go
  for a bool as we only represent two states (and I would like to
  further clean this up going forward)
- Initialized the spinlock
- Link to v1: https://lore.kernel.org/r/20240818-pmic-glink-v6-11-races-v1-0-f87c577e0bc9@quicinc.com

---
Bjorn Andersson (3):
      soc: qcom: pmic_glink: Fix race during initialization
      usb: typec: ucsi: Move unregister out of atomic section
      soc: qcom: pmic_glink: Actually communicate with remote goes down

 drivers/power/supply/qcom_battmgr.c   | 16 ++++++++-----
 drivers/soc/qcom/pmic_glink.c         | 40 ++++++++++++++++++++++----------
 drivers/soc/qcom/pmic_glink_altmode.c | 17 +++++++++-----
 drivers/usb/typec/ucsi/ucsi_glink.c   | 43 ++++++++++++++++++++++++++---------
 include/linux/soc/qcom/pmic_glink.h   | 11 +++++----
 5 files changed, 87 insertions(+), 40 deletions(-)
---
base-commit: 2fd613d27928293eaa87788b10e8befb6805cd42
change-id: 20240818-pmic-glink-v6-11-races-363f5964c339

Best regards,

Comments

Johan Hovold Aug. 20, 2024, 7:12 a.m. UTC | #1
On Mon, Aug 19, 2024 at 01:07:44PM -0700, Bjorn Andersson wrote:
> Amit and Johan both reported a NULL pointer dereference in the
> pmic_glink client code during initialization, and Stephen Boyd pointed
> out the problem (race condition).
> 
> While investigating, and writing the fix, I noticed that
> ucsi_unregister() is called in atomic context but tries to sleep, and I
> also noticed that the condition for when to inform the pmic_glink client
> drivers when the remote has gone down is just wrong.
> 
> So, let's fix all three.

> Changes in v2:
> - Refer to the correct commit in the ucsi_unregister() patch.
> - Updated wording in the same commit message about the new error message
>   in the log.
> - Changed the data type of the introduced state variables, opted to go
>   for a bool as we only represent two states (and I would like to
>   further clean this up going forward)
> - Initialized the spinlock
> - Link to v1: https://lore.kernel.org/r/20240818-pmic-glink-v6-11-races-v1-0-f87c577e0bc9@quicinc.com
> 
> ---
> Bjorn Andersson (3):
>       soc: qcom: pmic_glink: Fix race during initialization
>       usb: typec: ucsi: Move unregister out of atomic section
>       soc: qcom: pmic_glink: Actually communicate with remote goes down

Tested-by: Johan Hovold <johan+linaro@kernel.org>