Message ID | 20230201101559.15529-1-johan+linaro@kernel.org (mailing list archive) |
---|---|
Series | interconnect: fix racy provider registration |

On 01/02/2023 11:15, Johan Hovold wrote:
> The current interconnect provider interface is inherently racy as
> providers are expected to be registered before being fully initialised.
>
> This can specifically cause racing DT lookups to fail as I recently
> noticed when the Qualcomm cpufreq driver failed to probe:
>
>     of_icc_xlate_onecell: invalid index 0
>     cpu cpu0: error -EINVAL: error finding src node
>     cpu cpu0: dev_pm_opp_of_find_icc_paths: Unable to get path0: -22
>     qcom-cpufreq-hw: probe of 18591000.cpufreq failed with error -22
>
> This only happens very rarely, but the bug is easily reproduced by
> increasing the race window by adding an msleep() after registering
> osm-l3 interconnect provider.
>
> Note that the Qualcomm cpufreq driver is especially susceptible to this
> race as the interconnect path is looked up from the CPU nodes so that
> driver core does not guarantee the probe order even when device links
> are enabled (which they not always are).
>
> This series adds a new interconnect provider registration API which is
> used to fix up the interconnect drivers before removing the old racy
> API.
>

So is there a dependency or not? Can you make it clear that I shouldn't
take memory controller bits?

Best regards,
Krzysztof

On Thu, Feb 02, 2023 at 12:13:33PM +0100, Krzysztof Kozlowski wrote:
> On 01/02/2023 11:15, Johan Hovold wrote:
> > The current interconnect provider interface is inherently racy as
> > providers are expected to be registered before being fully initialised.
> >
> > This can specifically cause racing DT lookups to fail as I recently
> > noticed when the Qualcomm cpufreq driver failed to probe:
> >
> >     of_icc_xlate_onecell: invalid index 0
> >     cpu cpu0: error -EINVAL: error finding src node
> >     cpu cpu0: dev_pm_opp_of_find_icc_paths: Unable to get path0: -22
> >     qcom-cpufreq-hw: probe of 18591000.cpufreq failed with error -22
> >
> > This only happens very rarely, but the bug is easily reproduced by
> > increasing the race window by adding an msleep() after registering
> > osm-l3 interconnect provider.
> >
> > Note that the Qualcomm cpufreq driver is especially susceptible to this
> > race as the interconnect path is looked up from the CPU nodes so that
> > driver core does not guarantee the probe order even when device links
> > are enabled (which they not always are).
> >
> > This series adds a new interconnect provider registration API which is
> > used to fix up the interconnect drivers before removing the old racy
> > API.
> >
>
> So is there a dependency or not? Can you make it clear that I shouldn't
> take memory controller bits?

As the fixes depend on the new API it is best if these could all go
through Georgi's tree.

Johan
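
For readers following the thread, the ordering problem described in the cover letter can be sketched in plain userspace C. This is only a toy model, not kernel code and not the API the series adds: `toy_provider`, `toy_lookup()`, `toy_probe_racy()`, `toy_probe_fixed()` and `TOY_EPROBE_DEFER` are all hypothetical names, and the racing consumer is simulated by calling the lookup at the point where the race window would be. It assumes that a consumer which finds no registered provider is told to retry later, while one that finds a registered but empty provider gets -EINVAL (the -22 in the log above).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's "try again later" probe deferral. */
#define TOY_EPROBE_DEFER 517

/* Toy provider: not the kernel's struct, only the two fields that matter here. */
struct toy_provider {
	int num_nodes;	/* 0 until the driver has populated its nodes */
	int registered;	/* non-zero once consumers can find the provider */
};

/* Toy consumer lookup by index, loosely modelled on a onecell-style translation. */
static int toy_lookup(const struct toy_provider *p, int index)
{
	assert(p != NULL);

	if (!p->registered)
		return -TOY_EPROBE_DEFER;	/* provider not visible yet: retry later */
	if (index < 0 || index >= p->num_nodes)
		return -EINVAL;			/* the "invalid index 0" case */
	return 0;
}

/* Racy order: publish first, populate afterwards. Returns what a racing consumer sees. */
static int toy_probe_racy(struct toy_provider *p, int nodes)
{
	int seen;

	p->num_nodes = 0;
	p->registered = 1;		/* visible before it is initialised */
	seen = toy_lookup(p, 0);	/* consumer hits the race window */
	p->num_nodes = nodes;
	return seen;
}

/* Fixed order: populate first, publish last. Returns what a racing consumer sees. */
static int toy_probe_fixed(struct toy_provider *p, int nodes)
{
	int seen;

	p->registered = 0;
	p->num_nodes = nodes;		/* fully initialised while still invisible */
	seen = toy_lookup(p, 0);	/* consumer arriving early is asked to retry */
	p->registered = 1;
	return seen;
}
```

In this model a lookup that lands inside the window of `toy_probe_racy()` fails hard with -EINVAL, whereas the same lookup during `toy_probe_fixed()` only gets a retryable deferral and succeeds once registration has completed.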