Message ID | 20240819-pmic-glink-v6-11-races-v2-0-88fe3ab1f0e2@quicinc.com (mailing list archive) |
---|---|
Series | soc: qcom: pmic_glink: v6.11-rc bug fixes |
On Mon, Aug 19, 2024 at 01:07:44PM -0700, Bjorn Andersson wrote:
> Amit and Johan both reported a NULL pointer dereference in the
> pmic_glink client code during initialization, and Stephen Boyd pointed
> out the problem (race condition).
>
> While investigating, and writing the fix, I noticed that
> ucsi_unregister() is called in atomic context but tries to sleep, and I
> also noticed that the condition for when to inform the pmic_glink client
> drivers when the remote has gone down is just wrong.
>
> So, let's fix all three.

> Changes in v2:
> - Refer to the correct commit in the ucsi_unregister() patch.
> - Updated wording in the same commit message about the new error message
>   in the log.
> - Changed the data type of the introduced state variables, opted to go
>   for a bool as we only represent two states (and I would like to
>   further clean this up going forward)
> - Initialized the spinlock
> - Link to v1: https://lore.kernel.org/r/20240818-pmic-glink-v6-11-races-v1-0-f87c577e0bc9@quicinc.com
>
> ---
> Bjorn Andersson (3):
>       soc: qcom: pmic_glink: Fix race during initialization
>       usb: typec: ucsi: Move unregister out of atomic section
>       soc: qcom: pmic_glink: Actually communicate with remote goes down

Tested-by: Johan Hovold <johan+linaro@kernel.org>

Amit and Johan both reported a NULL pointer dereference in the
pmic_glink client code during initialization, and Stephen Boyd pointed
out the problem (race condition).

While investigating, and writing the fix, I noticed that
ucsi_unregister() is called in atomic context but tries to sleep, and I
also noticed that the condition for when to inform the pmic_glink client
drivers when the remote has gone down is just wrong.

So, let's fix all three.

As mentioned in the commit message for the UCSI fix, I have a series in
the works that makes the GLINK callback happen in a sleepable context,
which would remove the need for the clients list to be protected by a
spinlock, and remove the work scheduling. This is however not -rc
material...

In addition to the NULL pointer dereference, there is the -ECANCELED
issue reported here:
https://lore.kernel.org/all/Zqet8iInnDhnxkT9@hovoldconsulting.com/
Johan reports that these fixes do not address that issue.

Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
---
Changes in v2:
- Refer to the correct commit in the ucsi_unregister() patch.
- Updated wording in the same commit message about the new error message
  in the log.
- Changed the data type of the introduced state variables, opted to go
  for a bool as we only represent two states (and I would like to
  further clean this up going forward)
- Initialized the spinlock
- Link to v1: https://lore.kernel.org/r/20240818-pmic-glink-v6-11-races-v1-0-f87c577e0bc9@quicinc.com

---
Bjorn Andersson (3):
      soc: qcom: pmic_glink: Fix race during initialization
      usb: typec: ucsi: Move unregister out of atomic section
      soc: qcom: pmic_glink: Actually communicate with remote goes down

 drivers/power/supply/qcom_battmgr.c   | 16 ++++++++-----
 drivers/soc/qcom/pmic_glink.c         | 40 ++++++++++++++++++++++----------
 drivers/soc/qcom/pmic_glink_altmode.c | 17 +++++++++-----
 drivers/usb/typec/ucsi/ucsi_glink.c   | 43 ++++++++++++++++++++++++++---------
 include/linux/soc/qcom/pmic_glink.h   | 11 +++++----
 5 files changed, 87 insertions(+), 40 deletions(-)
---
base-commit: 2fd613d27928293eaa87788b10e8befb6805cd42
change-id: 20240818-pmic-glink-v6-11-races-363f5964c339

Best regards,
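
For readers unfamiliar with the "sleeping call in atomic context" problem
mentioned for ucsi_unregister(), the general shape of such a fix can be
sketched in userspace C. This is not the code from the patches: the
struct, the function names, the CLIENT_INIT macro and the atomic_depth
counter are all hypothetical, a C11 atomic_flag stands in for the kernel
spinlock, and an explicit client_work() call stands in for a scheduled
work item. The callback only records the new state (as a bool) under the
lock; the call that may sleep runs later, with the lock released.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Per-thread count of held locks; a stand-in for "atomic context". */
static _Thread_local int atomic_depth;

/* Hypothetical client state, not the driver's actual structure. */
struct client {
	atomic_flag lock;    /* stands in for the kernel spinlock */
	bool remote_up;      /* last state reported by the callback */
	bool registered;     /* only touched by the (single) worker */
	bool work_pending;   /* set by the callback, consumed by the worker */
};

/* The lock must be initialized before first use. */
#define CLIENT_INIT { .lock = ATOMIC_FLAG_INIT }

static void client_lock(struct client *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
	atomic_depth++;
}

static void client_unlock(struct client *c)
{
	atomic_depth--;
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Stand-ins for calls that may sleep: they must not run under the lock. */
static void slow_register(struct client *c)
{
	assert(atomic_depth == 0);
	c->registered = true;
}

static void slow_unregister(struct client *c)
{
	assert(atomic_depth == 0);
	c->registered = false;
}

/* State callback: runs with the lock held, so it only records state. */
static void client_state_cb(struct client *c, bool up)
{
	client_lock(c);
	c->remote_up = up;
	c->work_pending = true;
	client_unlock(c);
}

/* Deferred work: snapshot the state under the lock, act after unlocking. */
static void client_work(struct client *c)
{
	bool up;

	client_lock(c);
	if (!c->work_pending) {
		client_unlock(c);
		return;
	}
	c->work_pending = false;
	up = c->remote_up;
	client_unlock(c);

	if (up && !c->registered)
		slow_register(c);
	else if (!up && c->registered)
		slow_unregister(c);
}
```

A caller would declare "struct client c = CLIENT_INIT;", report state
changes with client_state_cb(&c, true) or client_state_cb(&c, false),
and run client_work(&c) from a context that is allowed to block. The
asserts in the slow_*() helpers trip if either is ever reached with the
lock held.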