Message ID | 20170814195832.20524-4-hdegoede@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Aug 14, 2017 at 09:58:32PM +0200, Hans de Goede wrote: > intel_uncore_forcewake_reset() does forcewake puts and gets as such > we need to make sure that no-one tries to access the PUNIT->PMIC bus > (on systems where this bus is shared) while it runs, otherwise bad > things happen. > > Normally this is taken care of by the i915_pmic_bus_access_notifier() > which does an intel_uncore_forcewake_get(FORCEWAKE_ALL) when some other > driver tries to access the PMIC bus, so that later forcewake gets are > no-ops (for the duration of the bus access). > > But intel_uncore_forcewake_reset gets called in 3 cases: > 1) Before registering the pmic_bus_access_notifier > 2) After unregistering the pmic_bus_access_notifier > 3) To reset forcewake state on a GPU reset > > In all 3 cases the i915_pmic_bus_access_notifier() protection is > insufficient. > > This commit fixes this race by calling iosf_mbi_punit_acquire() before > calling intel_uncore_forcewake_reset(). In the case where it is called > directly after unregistering the pmic_bus_access_notifier, we need to > hold the punit-lock over both calls to avoid a race where > intel_uncore_fw_release_timer() may execute between the 2 calls. > > To allow holding the lock over both calls we need an unlocked > variant of iosf_mbi_unregister_pmic_bus_access_notifier. Since > intel_uncore.c is the only user of this function, we simply > modify it in this commit. Doing this in a separate commit would > require first adding an unlocked variant, then this commit and > then removing the unused normal variant. > > Signed-off-by: Hans de Goede <hdegoede@redhat.com> > --- > Changes in v2: > -Rebase on current (July 6th 2017) drm-next > > Changes in v3: > -Keep punit acquired / locked over the unregister + forcewake_reset > call combo to avoid a race hitting between the 2 calls > -This requires modifying iosf_mbi_unregister_pmic_bus_access_notifier > to not take the lock itself, since we are the only users this is done > in this same commit > --- > arch/x86/include/asm/iosf_mbi.h | 10 ++++++++-- > arch/x86/platform/intel/iosf_mbi.c | 14 +++++--------- > drivers/gpu/drm/i915/intel_uncore.c | 17 +++++++++++++---- > 3 files changed, 26 insertions(+), 15 deletions(-) > > diff --git a/arch/x86/include/asm/iosf_mbi.h b/arch/x86/include/asm/iosf_mbi.h > index c313cac36f56..f8841bb06d98 100644 > --- a/arch/x86/include/asm/iosf_mbi.h > +++ b/arch/x86/include/asm/iosf_mbi.h > @@ -141,9 +141,14 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb); > /** > * iosf_mbi_register_pmic_bus_access_notifier - Unregister PMIC bus notifier You missed the rename in the doc. > * > + * Note the caller must call iosf_mbi_punit_acquire() before calling this > + * to ensure the bus is inactive before unregistering (and call > + * iosf_mbi_punit_release() afterwards). > + * > * @nb: notifier_block to unregister > */ > -int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb); > +int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( > + struct notifier_block *nb); > > /** > * iosf_mbi_call_pmic_bus_access_notifier_chain - Call PMIC bus notifier chain > @@ -191,7 +196,8 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb) > } > > static inline > -int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb) > +int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( > + struct notifier_block *nb) > { > return 0; > } > diff --git a/arch/x86/platform/intel/iosf_mbi.c b/arch/x86/platform/intel/iosf_mbi.c > index a952ac199741..5596a3ec1b89 100644 > --- a/arch/x86/platform/intel/iosf_mbi.c > +++ b/arch/x86/platform/intel/iosf_mbi.c > @@ -218,19 +218,15 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb) > } > EXPORT_SYMBOL(iosf_mbi_register_pmic_bus_access_notifier); > > -int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb) > +int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( > + struct notifier_block *nb) > { > - int ret; > + WARN_ON(!mutex_is_locked(&iosf_mbi_punit_mutex)); > > - /* Wait for the bus to go inactive before unregistering */ > - mutex_lock(&iosf_mbi_punit_mutex); > - ret = blocking_notifier_chain_unregister( > + return blocking_notifier_chain_unregister( > &iosf_mbi_pmic_bus_access_notifier, nb); > - mutex_unlock(&iosf_mbi_punit_mutex); > - > - return ret; > } > -EXPORT_SYMBOL(iosf_mbi_unregister_pmic_bus_access_notifier); > +EXPORT_SYMBOL(iosf_mbi_unregister_pmic_bus_access_notifier_unlocked); > > int iosf_mbi_call_pmic_bus_access_notifier_chain(unsigned long val, void *v) > { > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c > index 569d115918eb..7be6150520ed 100644 > --- a/drivers/gpu/drm/i915/intel_uncore.c > +++ b/drivers/gpu/drm/i915/intel_uncore.c > @@ -229,6 +229,7 @@ intel_uncore_fw_release_timer(struct hrtimer *timer) > return HRTIMER_NORESTART; > } > > +/* Note callers must have acquired the PUNIT->PMIC bus, before calling this. */ Nit: could use an iosf_assert_mbi_punit_mutex_held() helper instead of the comment. (taking CONFIG_IOSF_MBI into account) > static void intel_uncore_forcewake_reset(struct drm_i915_private *dev_priv, > bool restore) > { > @@ -416,14 +417,18 @@ static void __intel_uncore_early_sanitize(struct drm_i915_private *dev_priv, > GT_FIFO_CTL_RC6_POLICY_STALL); > } > > + iosf_mbi_punit_acquire(); > intel_uncore_forcewake_reset(dev_priv, restore_forcewake); > + iosf_mbi_punit_release(); > } > > void intel_uncore_suspend(struct drm_i915_private *dev_priv) > { > - iosf_mbi_unregister_pmic_bus_access_notifier( > + iosf_mbi_punit_acquire(); > + iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( > &dev_priv->uncore.pmic_bus_access_nb); > intel_uncore_forcewake_reset(dev_priv, false); > + iosf_mbi_punit_release(); > } > > void intel_uncore_resume_early(struct drm_i915_private *dev_priv) > @@ -1246,12 +1251,14 @@ void intel_uncore_init(struct drm_i915_private *dev_priv) > > void intel_uncore_fini(struct drm_i915_private *dev_priv) > { > - iosf_mbi_unregister_pmic_bus_access_notifier( > - &dev_priv->uncore.pmic_bus_access_nb); > - > /* Paranoia: make sure we have disabled everything before we exit. */ > intel_uncore_sanitize(dev_priv); > + > + iosf_mbi_punit_acquire(); > + iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( > + &dev_priv->uncore.pmic_bus_access_nb); > intel_uncore_forcewake_reset(dev_priv, false); > + iosf_mbi_punit_release(); > } > > #define GEN_RANGE(l, h) GENMASK((h) - 1, (l) - 1) > @@ -1524,7 +1531,9 @@ static int gen6_reset_engines(struct drm_i915_private *dev_priv, > > ret = gen6_hw_domain_reset(dev_priv, hw_mask); > > + iosf_mbi_punit_acquire(); > intel_uncore_forcewake_reset(dev_priv, true); > + iosf_mbi_punit_release(); > > return ret; > } There is one more spot in intel_uncore_check_forcewake_domains() calling intel_uncore_forcewake_reset() you missed. With the above fixes and optional change for the nit looks ok: Reviewed-by: Imre Deak <imre.deak@intel.com> > -- > 2.13.4 >
Hi, On 17-08-17 14:22, Imre Deak wrote: > On Mon, Aug 14, 2017 at 09:58:32PM +0200, Hans de Goede wrote: >> intel_uncore_forcewake_reset() does forcewake puts and gets as such >> we need to make sure that no-one tries to access the PUNIT->PMIC bus >> (on systems where this bus is shared) while it runs, otherwise bad >> things happen. >> >> Normally this is taken care of by the i915_pmic_bus_access_notifier() >> which does an intel_uncore_forcewake_get(FORCEWAKE_ALL) when some other >> driver tries to access the PMIC bus, so that later forcewake gets are >> no-ops (for the duration of the bus access). >> >> But intel_uncore_forcewake_reset gets called in 3 cases: >> 1) Before registering the pmic_bus_access_notifier >> 2) After unregistering the pmic_bus_access_notifier >> 3) To reset forcewake state on a GPU reset >> >> In all 3 cases the i915_pmic_bus_access_notifier() protection is >> insufficient. >> >> This commit fixes this race by calling iosf_mbi_punit_acquire() before >> calling intel_uncore_forcewake_reset(). In the case where it is called >> directly after unregistering the pmic_bus_access_notifier, we need to >> hold the punit-lock over both calls to avoid a race where >> intel_uncore_fw_release_timer() may execute between the 2 calls. >> >> To allow holding the lock over both calls we need an unlocked >> variant of iosf_mbi_unregister_pmic_bus_access_notifier. Since >> intel_uncore.c is the only user of this function, we simply >> modify it in this commit. Doing this in a separate commit would >> require first adding an unlocked variant, then this commit and >> then removing the unused normal variant. >> >> Signed-off-by: Hans de Goede <hdegoede@redhat.com> >> --- >> Changes in v2: >> -Rebase on current (July 6th 2017) drm-next >> >> Changes in v3: >> -Keep punit acquired / locked over the unregister + forcewake_reset >> call combo to avoid a race hitting between the 2 calls >> -This requires modifying iosf_mbi_unregister_pmic_bus_access_notifier >> to not take the lock itself, since we are the only users this is done >> in this same commit >> --- >> arch/x86/include/asm/iosf_mbi.h | 10 ++++++++-- >> arch/x86/platform/intel/iosf_mbi.c | 14 +++++--------- >> drivers/gpu/drm/i915/intel_uncore.c | 17 +++++++++++++---- >> 3 files changed, 26 insertions(+), 15 deletions(-) >> >> diff --git a/arch/x86/include/asm/iosf_mbi.h b/arch/x86/include/asm/iosf_mbi.h >> index c313cac36f56..f8841bb06d98 100644 >> --- a/arch/x86/include/asm/iosf_mbi.h >> +++ b/arch/x86/include/asm/iosf_mbi.h >> @@ -141,9 +141,14 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb); >> /** >> * iosf_mbi_register_pmic_bus_access_notifier - Unregister PMIC bus notifier > > You missed the rename in the doc. Thx, fixed for v4. >> * >> + * Note the caller must call iosf_mbi_punit_acquire() before calling this >> + * to ensure the bus is inactive before unregistering (and call >> + * iosf_mbi_punit_release() afterwards). >> + * >> * @nb: notifier_block to unregister >> */ >> -int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb); >> +int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( >> + struct notifier_block *nb); >> >> /** >> * iosf_mbi_call_pmic_bus_access_notifier_chain - Call PMIC bus notifier chain >> @@ -191,7 +196,8 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb) >> } >> >> static inline >> -int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb) >> +int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( >> + struct notifier_block *nb) >> { >> return 0; >> } >> diff --git a/arch/x86/platform/intel/iosf_mbi.c b/arch/x86/platform/intel/iosf_mbi.c >> index a952ac199741..5596a3ec1b89 100644 >> --- a/arch/x86/platform/intel/iosf_mbi.c >> +++ b/arch/x86/platform/intel/iosf_mbi.c >> @@ -218,19 +218,15 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb) >> } >> EXPORT_SYMBOL(iosf_mbi_register_pmic_bus_access_notifier); >> >> -int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb) >> +int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( >> + struct notifier_block *nb) >> { >> - int ret; >> + WARN_ON(!mutex_is_locked(&iosf_mbi_punit_mutex)); >> >> - /* Wait for the bus to go inactive before unregistering */ >> - mutex_lock(&iosf_mbi_punit_mutex); >> - ret = blocking_notifier_chain_unregister( >> + return blocking_notifier_chain_unregister( >> &iosf_mbi_pmic_bus_access_notifier, nb); >> - mutex_unlock(&iosf_mbi_punit_mutex); >> - >> - return ret; >> } >> -EXPORT_SYMBOL(iosf_mbi_unregister_pmic_bus_access_notifier); >> +EXPORT_SYMBOL(iosf_mbi_unregister_pmic_bus_access_notifier_unlocked); >> >> int iosf_mbi_call_pmic_bus_access_notifier_chain(unsigned long val, void *v) >> { >> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c >> index 569d115918eb..7be6150520ed 100644 >> --- a/drivers/gpu/drm/i915/intel_uncore.c >> +++ b/drivers/gpu/drm/i915/intel_uncore.c >> @@ -229,6 +229,7 @@ intel_uncore_fw_release_timer(struct hrtimer *timer) >> return HRTIMER_NORESTART; >> } >> >> +/* Note callers must have acquired the PUNIT->PMIC bus, before calling this. */ > > Nit: could use an iosf_assert_mbi_punit_mutex_held() helper instead of > the comment. (taking CONFIG_IOSF_MBI into account) Good idea, done for v4 (as iosf_mbi_assert_punit_acquired). >> static void intel_uncore_forcewake_reset(struct drm_i915_private *dev_priv, >> bool restore) >> { >> @@ -416,14 +417,18 @@ static void __intel_uncore_early_sanitize(struct drm_i915_private *dev_priv, >> GT_FIFO_CTL_RC6_POLICY_STALL); >> } >> >> + iosf_mbi_punit_acquire(); >> intel_uncore_forcewake_reset(dev_priv, restore_forcewake); >> + iosf_mbi_punit_release(); >> } >> >> void intel_uncore_suspend(struct drm_i915_private *dev_priv) >> { >> - iosf_mbi_unregister_pmic_bus_access_notifier( >> + iosf_mbi_punit_acquire(); >> + iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( >> &dev_priv->uncore.pmic_bus_access_nb); >> intel_uncore_forcewake_reset(dev_priv, false); >> + iosf_mbi_punit_release(); >> } >> >> void intel_uncore_resume_early(struct drm_i915_private *dev_priv) >> @@ -1246,12 +1251,14 @@ void intel_uncore_init(struct drm_i915_private *dev_priv) >> >> void intel_uncore_fini(struct drm_i915_private *dev_priv) >> { >> - iosf_mbi_unregister_pmic_bus_access_notifier( >> - &dev_priv->uncore.pmic_bus_access_nb); >> - >> /* Paranoia: make sure we have disabled everything before we exit. */ >> intel_uncore_sanitize(dev_priv); >> + >> + iosf_mbi_punit_acquire(); >> + iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( >> + &dev_priv->uncore.pmic_bus_access_nb); >> intel_uncore_forcewake_reset(dev_priv, false); >> + iosf_mbi_punit_release(); >> } >> >> #define GEN_RANGE(l, h) GENMASK((h) - 1, (l) - 1) >> @@ -1524,7 +1531,9 @@ static int gen6_reset_engines(struct drm_i915_private *dev_priv, >> >> ret = gen6_hw_domain_reset(dev_priv, hw_mask); >> >> + iosf_mbi_punit_acquire(); >> intel_uncore_forcewake_reset(dev_priv, true); >> + iosf_mbi_punit_release(); >> >> return ret; >> } > > There is one more spot in intel_uncore_check_forcewake_domains() calling > intel_uncore_forcewake_reset() you missed. Ugh, I did not check for intel_uncore_forcewake_reset() usage outside of intel_uncore.c as it is a static function, the include magic used for the selftests caught me by suprise there. > With the above fixes and optional change for the nit looks ok: > Reviewed-by: Imre Deak <imre.deak@intel.com> Thanks, v4 with everything fixed is coming up. Regards, Hans
diff --git a/arch/x86/include/asm/iosf_mbi.h b/arch/x86/include/asm/iosf_mbi.h index c313cac36f56..f8841bb06d98 100644 --- a/arch/x86/include/asm/iosf_mbi.h +++ b/arch/x86/include/asm/iosf_mbi.h @@ -141,9 +141,14 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb); /** * iosf_mbi_register_pmic_bus_access_notifier - Unregister PMIC bus notifier * + * Note the caller must call iosf_mbi_punit_acquire() before calling this + * to ensure the bus is inactive before unregistering (and call + * iosf_mbi_punit_release() afterwards). + * * @nb: notifier_block to unregister */ -int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb); +int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( + struct notifier_block *nb); /** * iosf_mbi_call_pmic_bus_access_notifier_chain - Call PMIC bus notifier chain @@ -191,7 +196,8 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb) } static inline -int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb) +int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( + struct notifier_block *nb) { return 0; } diff --git a/arch/x86/platform/intel/iosf_mbi.c b/arch/x86/platform/intel/iosf_mbi.c index a952ac199741..5596a3ec1b89 100644 --- a/arch/x86/platform/intel/iosf_mbi.c +++ b/arch/x86/platform/intel/iosf_mbi.c @@ -218,19 +218,15 @@ int iosf_mbi_register_pmic_bus_access_notifier(struct notifier_block *nb) } EXPORT_SYMBOL(iosf_mbi_register_pmic_bus_access_notifier); -int iosf_mbi_unregister_pmic_bus_access_notifier(struct notifier_block *nb) +int iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( + struct notifier_block *nb) { - int ret; + WARN_ON(!mutex_is_locked(&iosf_mbi_punit_mutex)); - /* Wait for the bus to go inactive before unregistering */ - mutex_lock(&iosf_mbi_punit_mutex); - ret = blocking_notifier_chain_unregister( + return blocking_notifier_chain_unregister( &iosf_mbi_pmic_bus_access_notifier, nb); - mutex_unlock(&iosf_mbi_punit_mutex); - - return ret; } -EXPORT_SYMBOL(iosf_mbi_unregister_pmic_bus_access_notifier); +EXPORT_SYMBOL(iosf_mbi_unregister_pmic_bus_access_notifier_unlocked); int iosf_mbi_call_pmic_bus_access_notifier_chain(unsigned long val, void *v) { diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 569d115918eb..7be6150520ed 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -229,6 +229,7 @@ intel_uncore_fw_release_timer(struct hrtimer *timer) return HRTIMER_NORESTART; } +/* Note callers must have acquired the PUNIT->PMIC bus, before calling this. */ static void intel_uncore_forcewake_reset(struct drm_i915_private *dev_priv, bool restore) { @@ -416,14 +417,18 @@ static void __intel_uncore_early_sanitize(struct drm_i915_private *dev_priv, GT_FIFO_CTL_RC6_POLICY_STALL); } + iosf_mbi_punit_acquire(); intel_uncore_forcewake_reset(dev_priv, restore_forcewake); + iosf_mbi_punit_release(); } void intel_uncore_suspend(struct drm_i915_private *dev_priv) { - iosf_mbi_unregister_pmic_bus_access_notifier( + iosf_mbi_punit_acquire(); + iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( &dev_priv->uncore.pmic_bus_access_nb); intel_uncore_forcewake_reset(dev_priv, false); + iosf_mbi_punit_release(); } void intel_uncore_resume_early(struct drm_i915_private *dev_priv) @@ -1246,12 +1251,14 @@ void intel_uncore_init(struct drm_i915_private *dev_priv) void intel_uncore_fini(struct drm_i915_private *dev_priv) { - iosf_mbi_unregister_pmic_bus_access_notifier( - &dev_priv->uncore.pmic_bus_access_nb); - /* Paranoia: make sure we have disabled everything before we exit. */ intel_uncore_sanitize(dev_priv); + + iosf_mbi_punit_acquire(); + iosf_mbi_unregister_pmic_bus_access_notifier_unlocked( + &dev_priv->uncore.pmic_bus_access_nb); intel_uncore_forcewake_reset(dev_priv, false); + iosf_mbi_punit_release(); } #define GEN_RANGE(l, h) GENMASK((h) - 1, (l) - 1) @@ -1524,7 +1531,9 @@ static int gen6_reset_engines(struct drm_i915_private *dev_priv, ret = gen6_hw_domain_reset(dev_priv, hw_mask); + iosf_mbi_punit_acquire(); intel_uncore_forcewake_reset(dev_priv, true); + iosf_mbi_punit_release(); return ret; }
intel_uncore_forcewake_reset() does forcewake puts and gets as such we need to make sure that no-one tries to access the PUNIT->PMIC bus (on systems where this bus is shared) while it runs, otherwise bad things happen. Normally this is taken care of by the i915_pmic_bus_access_notifier() which does an intel_uncore_forcewake_get(FORCEWAKE_ALL) when some other driver tries to access the PMIC bus, so that later forcewake gets are no-ops (for the duration of the bus access). But intel_uncore_forcewake_reset gets called in 3 cases: 1) Before registering the pmic_bus_access_notifier 2) After unregistering the pmic_bus_access_notifier 3) To reset forcewake state on a GPU reset In all 3 cases the i915_pmic_bus_access_notifier() protection is insufficient. This commit fixes this race by calling iosf_mbi_punit_acquire() before calling intel_uncore_forcewake_reset(). In the case where it is called directly after unregistering the pmic_bus_access_notifier, we need to hold the punit-lock over both calls to avoid a race where intel_uncore_fw_release_timer() may execute between the 2 calls. To allow holding the lock over both calls we need an unlocked variant of iosf_mbi_unregister_pmic_bus_access_notifier. Since intel_uncore.c is the only user of this function, we simply modify it in this commit. Doing this in a separate commit would require first adding an unlocked variant, then this commit and then removing the unused normal variant. Signed-off-by: Hans de Goede <hdegoede@redhat.com> --- Changes in v2: -Rebase on current (July 6th 2017) drm-next Changes in v3: -Keep punit acquired / locked over the unregister + forcewake_reset call combo to avoid a race hitting between the 2 calls -This requires modifying iosf_mbi_unregister_pmic_bus_access_notifier to not take the lock itself, since we are the only users this is done in this same commit --- arch/x86/include/asm/iosf_mbi.h | 10 ++++++++-- arch/x86/platform/intel/iosf_mbi.c | 14 +++++--------- drivers/gpu/drm/i915/intel_uncore.c | 17 +++++++++++++---- 3 files changed, 26 insertions(+), 15 deletions(-)