Message ID | 12456961.O9o76ZdvQC@kreacher (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Series | [v3] thermal: core: Do not fail cdev registration because of invalid initial state |
On 06/06/2024 20:14, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH v3] thermal: core: Do not fail cdev registration because of invalid initial state
>
> It is reported that commit 31a0fa0019b0 ("thermal/debugfs: Pass cooling
> device state to thermal_debug_cdev_add()") causes the ACPI fan driver
> to fail probing on some systems which turns out to be due to the _FST
> control method returning an invalid value until _FSL is first evaluated
> for the given fan. If this happens, the .get_cur_state() cooling device
> callback returns an error and __thermal_cooling_device_register() fails
> as it uses that callback after commit 31a0fa0019b0.
>
> Arguably, _FST should not return an invalid value even if it is
> evaluated before _FSL, so this may be regarded as a platform firmware
> issue, but at the same time it is not a good enough reason for failing
> the cooling device registration where the initial cooling device state
> is only needed to initialize a thermal debug facility.
>
> Accordingly, modify __thermal_cooling_device_register() to avoid
> calling thermal_debug_cdev_add() instead of returning an error if the
> initial .get_cur_state() callback invocation fails.
>
> Fixes: 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()")
> Closes: https://lore.kernel.org/linux-acpi/20240530153727.843378-1-laura.nao@collabora.com
> Reported-by: Laura Nao <laura.nao@collabora.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
On 6/6/24 20:14, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Subject: [PATCH v3] thermal: core: Do not fail cdev registration because of invalid initial state
>
> It is reported that commit 31a0fa0019b0 ("thermal/debugfs: Pass cooling
> device state to thermal_debug_cdev_add()") causes the ACPI fan driver
> to fail probing on some systems which turns out to be due to the _FST
> control method returning an invalid value until _FSL is first evaluated
> for the given fan. If this happens, the .get_cur_state() cooling device
> callback returns an error and __thermal_cooling_device_register() fails
> as it uses that callback after commit 31a0fa0019b0.
>
> Arguably, _FST should not return an invalid value even if it is
> evaluated before _FSL, so this may be regarded as a platform firmware
> issue, but at the same time it is not a good enough reason for failing
> the cooling device registration where the initial cooling device state
> is only needed to initialize a thermal debug facility.
>
> Accordingly, modify __thermal_cooling_device_register() to avoid
> calling thermal_debug_cdev_add() instead of returning an error if the
> initial .get_cur_state() callback invocation fails.
>
> Fixes: 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()")
> Closes: https://lore.kernel.org/linux-acpi/20240530153727.843378-1-laura.nao@collabora.com
> Reported-by: Laura Nao <laura.nao@collabora.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---

Tested-by: Laura Nao <laura.nao@collabora.com>

Thanks,

Laura
Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -999,9 +999,17 @@ __thermal_cooling_device_register(struct
 	if (ret)
 		goto out_cdev_type;
 
+	/*
+	 * The cooling device's current state is only needed for debug
+	 * initialization below, so a failure to get it does not cause
+	 * the entire cooling device initialization to fail.  However,
+	 * the debug will not work for the device if its initial state
+	 * cannot be determined and drivers are responsible for ensuring
+	 * that this will not happen.
+	 */
 	ret = cdev->ops->get_cur_state(cdev, &current_state);
 	if (ret)
-		goto out_cdev_type;
+		current_state = ULONG_MAX;
 
 	thermal_cooling_device_setup_sysfs(cdev);
 
@@ -1016,7 +1024,8 @@ __thermal_cooling_device_register(struct
 		return ERR_PTR(ret);
 	}
 
-	thermal_debug_cdev_add(cdev, current_state);
+	if (current_state <= cdev->max_state)
+		thermal_debug_cdev_add(cdev, current_state);
 
 	/* Add 'this' new cdev to the global cdev list */
 	mutex_lock(&thermal_list_lock);
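
For readers who want to try the control flow outside the kernel, here is a minimal user-space sketch of the pattern the patch uses: a failed initial state read is turned into the ULONG_MAX sentinel, registration continues, and the debug hookup is skipped when the state is out of range. Everything named demo_* is hypothetical and exists only for illustration; it is not the kernel's struct thermal_cooling_device or its API, and the sentinel check assumes max_state is below ULONG_MAX.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a cooling device (not kernel code). */
struct demo_cdev {
	unsigned long max_state;
	/* Returns 0 on success or a negative errno-style value on failure. */
	int (*get_cur_state)(struct demo_cdev *cdev, unsigned long *state);
	bool registered;
	bool debug_added;
	unsigned long debug_initial_state;
};

/* A well-behaved callback: always reports state 2. */
static int demo_get_state_ok(struct demo_cdev *cdev, unsigned long *state)
{
	(void)cdev;
	*state = 2;
	return 0;
}

/* Models firmware that cannot report a valid state yet. */
static int demo_get_state_broken(struct demo_cdev *cdev, unsigned long *state)
{
	(void)cdev;
	(void)state;
	return -22; /* -EINVAL */
}

/* Registration that tolerates a failing initial state read. */
static int demo_register(struct demo_cdev *cdev)
{
	unsigned long current_state;

	/* On failure, mark the state as unknown instead of bailing out. */
	if (cdev->get_cur_state(cdev, &current_state))
		current_state = ULONG_MAX;

	cdev->registered = true;

	/* Only set up the debug record when the initial state is valid. */
	if (current_state <= cdev->max_state) {
		cdev->debug_added = true;
		cdev->debug_initial_state = current_state;
	}

	return 0;
}
```

With max_state set to 3, a device using demo_get_state_ok ends up registered with a debug record holding state 2, while a device using demo_get_state_broken is still registered but gets no debug record.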